Dataset statistics
| Number of variables | 28 |
|---|---|
| Number of observations | 4916 |
| Missing cells | 2654 |
| Missing cells (%) | 1.9% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 1.1 MiB |
| Average record size in memory | 224.0 B |
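The derived figures in this table can be rechecked from the raw counts. The following is a minimal sketch that uses only the numbers reported above; the variable names are chosen here for illustration, and the MiB conversion assumes 1 MiB = 2**20 bytes.

```python
# Figures copied from the Dataset statistics table.
n_variables = 28
n_observations = 4916
missing_cells = 2654
avg_record_bytes = 224.0

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Approximate total size, assuming 1 MiB = 2**20 bytes.
total_mib = avg_record_bytes * n_observations / 2**20

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"size in memory: {total_mib:.1f} MiB")
```

Both results agree with the table once rounded to one decimal place.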
Variable types
| Categorical | 12 |
|---|---|
| Numeric | 16 |
Warnings
| Warning | Type |
|---|---|
| movie_title has a high cardinality: 4916 distinct values | High cardinality |
| director_name has a high cardinality: 2397 distinct values | High cardinality |
| actor_2_name has a high cardinality: 3030 distinct values | High cardinality |
| genres has a high cardinality: 914 distinct values | High cardinality |
| actor_1_name has a high cardinality: 2095 distinct values | High cardinality |
| actor_3_name has a high cardinality: 3519 distinct values | High cardinality |
| plot_keywords has a high cardinality: 4756 distinct values | High cardinality |
| movie_imdb_link has a high cardinality: 4916 distinct values | High cardinality |
| country has a high cardinality: 65 distinct values | High cardinality |
| actor_1_facebook_likes is highly correlated with cast_total_facebook_likes | High correlation |
| cast_total_facebook_likes is highly correlated with actor_1_facebook_likes | High correlation |
| director_name has 102 (2.1%) missing values | Missing |
| director_facebook_likes has 102 (2.1%) missing values | Missing |
| gross has 862 (17.5%) missing values | Missing |
| plot_keywords has 152 (3.1%) missing values | Missing |
| content_rating has 300 (6.1%) missing values | Missing |
| budget has 484 (9.8%) missing values | Missing |
| title_year has 106 (2.2%) missing values | Missing |
| aspect_ratio has 326 (6.6%) missing values | Missing |
| budget is highly skewed (γ1 = 25.36637236) | Skewed |
| movie_title is uniformly distributed | Uniform |
| actor_3_name is uniformly distributed | Uniform |
| plot_keywords is uniformly distributed | Uniform |
| movie_imdb_link is uniformly distributed | Uniform |
| movie_title has unique values | Unique |
| movie_imdb_link has unique values | Unique |
| director_facebook_likes has 877 (17.8%) zeros | Zeros |
| actor_3_facebook_likes has 89 (1.8%) zeros | Zeros |
| facenumber_in_poster has 2089 (42.5%) zeros | Zeros |
| actor_2_facebook_likes has 55 (1.1%) zeros | Zeros |
| movie_facebook_likes has 2130 (43.3%) zeros | Zeros |
Reproduction
| Analysis started | 2021-04-07 01:46:02.115881 |
|---|---|
| Analysis finished | 2021-04-07 01:46:37.073113 |
| Duration | 34.96 seconds |
| Software version | pandas-profiling v2.11.0 |
| Download configuration | config.yaml |
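The reported duration is the difference between the two timestamps. A small sketch of that arithmetic, using the timestamps exactly as printed in the table (the format string is an assumption about how they were rendered):

```python
from datetime import datetime

# Timestamps copied from the Reproduction table.
FORMAT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2021-04-07 01:46:02.115881", FORMAT)
finished = datetime.strptime("2021-04-07 01:46:37.073113", FORMAT)

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```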
movie_title
Categorical
| Distinct | 4916 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 38.5 KiB |
| Les couloirs du temps: Les visiteurs II | 1 |
|---|---|
| True Grit | 1 |
| Striptease | 1 |
| Rio 2 | 1 |
| Christmas with the Kranks | 1 |
| Other values (4911) |
Length
| Max length | 86 |
|---|---|
| Median length | 14 |
| Mean length | 15.3445891 |
| Min length | 1 |
Characters and Unicode
| Total characters | 75434 |
|---|---|
| Distinct characters | 96 |
| Distinct categories | 13 |
| Distinct scripts | 2 |
| Distinct blocks | 2 |
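The category names used in this report (Lowercase Letter, Space Separator, and so on) match the Unicode general categories. As a rough sketch of how such counts could be reproduced with Python's standard `unicodedata` module (the helper `category_counts` and the name mapping are illustrative assumptions, not the profiler's own code), applied to the five sample titles from this report:

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur in the sample below.
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Nd": "Decimal Number",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for char in value:
            code = unicodedata.category(char)
            # Fall back to the two-letter code when no long name is mapped.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# The five sample titles listed in this report.
titles = [
    "Avatar",
    "Pirates of the Caribbean: At World's End",
    "Spectre",
    "The Dark Knight Rises",
    "Star Wars: Episode VII - The Force Awakens",
]
for name, count in category_counts(titles).most_common():
    print(f"{name}: {count}")
```

Scripts and blocks are not covered here because the standard library does not expose them.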
Unique
| Unique | 4916 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | Avatar |
|---|---|
| 2nd row | Pirates of the Caribbean: At World's End |
| 3rd row | Spectre |
| 4th row | The Dark Knight Rises |
| 5th row | Star Wars: Episode VII - The Force Awakens |
| Value | Count | Frequency (%) |
| Les couloirs du temps: Les visiteurs II | 1 | < 0.1% |
| True Grit | 1 | < 0.1% |
| Striptease | 1 | < 0.1% |
| Rio 2 | 1 | < 0.1% |
| Christmas with the Kranks | 1 | < 0.1% |
| Street Fighter | 1 | < 0.1% |
| Yentl | 1 | < 0.1% |
| Dragonfly | 1 | < 0.1% |
| Unnatural | 1 | < 0.1% |
| Pinocchio | 1 | < 0.1% |
| Other values (4906) | 4906 |
| Value | Count | Frequency (%) |
| the | 1555 | 11.4% |
| of | 473 | 3.5% |
| a | 185 | 1.4% |
| and | 145 | 1.1% |
| in | 121 | 0.9% |
| to | 106 | 0.8% |
| 2 | 103 | 0.8% |
| | 80 | 0.6% |
| man | 66 | 0.5% |
| love | 55 | 0.4% |
| Other values (4905) | 10759 |
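The word table above appears to count lowercase, whitespace-separated tokens. The sketch below assumes that rule, plus stripping of leading and trailing punctuation, which would explain the row with an empty value (a token such as a lone `-` becomes an empty string). The helper `word_counts` is hypothetical and is applied to the five sample titles from this report only.

```python
import string
from collections import Counter


def word_counts(values):
    """Count lowercase whitespace-separated tokens, punctuation stripped."""
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            # Strip punctuation at both ends only; "world's" keeps its apostrophe.
            counts[token.strip(string.punctuation)] += 1
    return counts


# The five sample titles listed in this report.
titles = [
    "Avatar",
    "Pirates of the Caribbean: At World's End",
    "Spectre",
    "The Dark Knight Rises",
    "Star Wars: Episode VII - The Force Awakens",
]
print(word_counts(titles).most_common(3))
```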
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 8732 | 11.6% |
| e | 7719 | 10.2% |
| a | 4737 | 6.3% |
| o | 4563 | 6.0% |
| r | 4048 | 5.4% |
| n | 4043 | 5.4% |
| i | 3862 | 5.1% |
| t | 3729 | 4.9% |
| s | 2934 | 3.9% |
| h | 2903 | 3.8% |
| Other values (86) | 28164 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 53165 | |
| Uppercase Letter | 11966 | 15.9% |
| Space Separator | 8732 | 11.6% |
| Other Punctuation | 943 | 1.3% |
| Decimal Number | 517 | 0.7% |
| Dash Punctuation | 91 | 0.1% |
| Open Punctuation | 5 | < 0.1% |
| Close Punctuation | 5 | < 0.1% |
| Currency Symbol | 4 | < 0.1% |
| Other Number | 2 | < 0.1% |
| Other values (3) | 4 | < 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| e | 7719 | |
| a | 4737 | 8.9% |
| o | 4563 | 8.6% |
| r | 4048 | 7.6% |
| n | 4043 | 7.6% |
| i | 3862 | 7.3% |
| t | 3729 | 7.0% |
| s | 2934 | 5.5% |
| h | 2903 | 5.5% |
| l | 2475 | 4.7% |
| Other values (25) | 12152 |
| Value | Count | Frequency (%) |
| T | 1678 | |
| S | 1034 | 8.6% |
| M | 813 | 6.8% |
| B | 764 | 6.4% |
| D | 710 | 5.9% |
| C | 672 | 5.6% |
| A | 652 | 5.4% |
| L | 563 | 4.7% |
| H | 552 | 4.6% |
| W | 497 | 4.2% |
| Other values (17) | 4031 |
| Value | Count | Frequency (%) |
| : | 366 | |
| ' | 229 | |
| . | 145 | 15.4% |
| , | 77 | 8.2% |
| & | 61 | 6.5% |
| ! | 32 | 3.4% |
| ? | 16 | 1.7% |
| / | 8 | 0.8% |
| * | 5 | 0.5% |
| # | 2 | 0.2% |
| Other values (2) | 2 | 0.2% |
| Value | Count | Frequency (%) |
| 2 | 145 | |
| 3 | 86 | |
| 1 | 82 | |
| 0 | 81 | |
| 4 | 35 | 6.8% |
| 8 | 21 | 4.1% |
| 5 | 21 | 4.1% |
| 9 | 17 | 3.3% |
| 7 | 15 | 2.9% |
| 6 | 14 | 2.7% |
| Value | Count | Frequency (%) |
| ¢ | 2 | |
| $ | 2 |
| Value | Count | Frequency (%) |
| ( | 3 | |
| [ | 2 |
| Value | Count | Frequency (%) |
| ) | 3 | |
| ] | 2 |
| Value | Count | Frequency (%) |
| (space) | 8732 | |
| Value | Count | Frequency (%) |
| - | 91 |
| Value | Count | Frequency (%) |
| ½ | 2 |
| Value | Count | Frequency (%) |
| + | 2 |
| Value | Count | Frequency (%) |
| _ | 1 |
| Value | Count | Frequency (%) |
| ° | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 65131 | |
| Common | 10303 | 13.7% |
Most frequent character per script
| Value | Count | Frequency (%) |
| e | 7719 | 11.9% |
| a | 4737 | 7.3% |
| o | 4563 | 7.0% |
| r | 4048 | 6.2% |
| n | 4043 | 6.2% |
| i | 3862 | 5.9% |
| t | 3729 | 5.7% |
| s | 2934 | 4.5% |
| h | 2903 | 4.5% |
| l | 2475 | 3.8% |
| Other values (52) | 24118 |
| Value | Count | Frequency (%) |
| (space) | 8732 | |
| : | 366 | 3.6% |
| ' | 229 | 2.2% |
| 2 | 145 | 1.4% |
| . | 145 | 1.4% |
| - | 91 | 0.9% |
| 3 | 86 | 0.8% |
| 1 | 82 | 0.8% |
| 0 | 81 | 0.8% |
| , | 77 | 0.7% |
| Other values (24) | 269 | 2.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 75411 | |
| None | 23 | < 0.1% |
Most frequent character per block
| Value | Count | Frequency (%) |
| (space) | 8732 | 11.6% |
| e | 7719 | 10.2% |
| a | 4737 | 6.3% |
| o | 4563 | 6.1% |
| r | 4048 | 5.4% |
| n | 4043 | 5.4% |
| i | 3862 | 5.1% |
| t | 3729 | 4.9% |
| s | 2934 | 3.9% |
| h | 2903 | 3.8% |
| Other values (72) | 28141 |
| Value | Count | Frequency (%) |
| é | 8 | |
| ¢ | 2 | 8.7% |
| ½ | 2 | 8.7% |
| · | 1 | 4.3% |
| à | 1 | 4.3% |
| Æ | 1 | 4.3% |
| ü | 1 | 4.3% |
| è | 1 | 4.3% |
| ä | 1 | 4.3% |
| á | 1 | 4.3% |
| Other values (4) | 4 |
color
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 19 |
| Missing (%) | 0.4% |
| Memory size | 38.5 KiB |
| Color | |
|---|---|
| Black and White | 204 |
Length
| Max length | 15 |
|---|---|
| Median length | 5 |
| Mean length | 5.416581581 |
| Min length | 5 |
Characters and Unicode
| Total characters | 26525 |
|---|---|
| Distinct characters | 16 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Color |
|---|---|
| 2nd row | Color |
| 3rd row | Color |
| 4th row | Color |
| 5th row | Color |
| Value | Count | Frequency (%) |
| Color | 4693 | |
| Black and White | 204 | 4.1% |
| (Missing) | 19 | 0.4% |
| Value | Count | Frequency (%) |
| color | 4693 | |
| white | 204 | 3.8% |
| and | 204 | 3.8% |
| black | 204 | 3.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| o | 9386 | |
| l | 4897 | |
| C | 4693 | |
| r | 4693 | |
| a | 408 | 1.5% |
| (space) | 408 | 1.5% |
| B | 204 | 0.8% |
| c | 204 | 0.8% |
| k | 204 | 0.8% |
| n | 204 | 0.8% |
| Other values (6) | 1224 | 4.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 21016 | |
| Uppercase Letter | 5101 | 19.2% |
| Space Separator | 408 | 1.5% |
Most frequent character per category
| Value | Count | Frequency (%) |
| o | 9386 | |
| l | 4897 | |
| r | 4693 | |
| a | 408 | 1.9% |
| c | 204 | 1.0% |
| k | 204 | 1.0% |
| n | 204 | 1.0% |
| d | 204 | 1.0% |
| h | 204 | 1.0% |
| i | 204 | 1.0% |
| Other values (2) | 408 | 1.9% |
| Value | Count | Frequency (%) |
| C | 4693 | |
| B | 204 | 4.0% |
| W | 204 | 4.0% |
| Value | Count | Frequency (%) |
| (space) | 408 | |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 26117 | |
| Common | 408 | 1.5% |
Most frequent character per script
| Value | Count | Frequency (%) |
| o | 9386 | |
| l | 4897 | |
| C | 4693 | |
| r | 4693 | |
| a | 408 | 1.6% |
| B | 204 | 0.8% |
| c | 204 | 0.8% |
| k | 204 | 0.8% |
| n | 204 | 0.8% |
| d | 204 | 0.8% |
| Other values (5) | 1020 | 3.9% |
| Value | Count | Frequency (%) |
| (space) | 408 | |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 26525 |
Most frequent character per block
| Value | Count | Frequency (%) |
| o | 9386 | |
| l | 4897 | |
| C | 4693 | |
| r | 4693 | |
| a | 408 | 1.5% |
| (space) | 408 | 1.5% |
| B | 204 | 0.8% |
| c | 204 | 0.8% |
| k | 204 | 0.8% |
| n | 204 | 0.8% |
| Other values (6) | 1224 | 4.6% |
director_name
Categorical
| Distinct | 2397 |
|---|---|
| Distinct (%) | 49.8% |
| Missing | 102 |
| Missing (%) | 2.1% |
| Memory size | 38.5 KiB |
| Steven Spielberg | 26 |
|---|---|
| Woody Allen | 22 |
| Clint Eastwood | 20 |
| Martin Scorsese | 20 |
| Spike Lee | 16 |
| Other values (2392) |
Length
| Max length | 32 |
|---|---|
| Median length | 13 |
| Mean length | 13.0847528 |
| Min length | 3 |
Characters and Unicode
| Total characters | 62990 |
|---|---|
| Distinct characters | 76 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 2 |
Unique
| Unique | 1523 |
|---|---|
| Unique (%) | 31.6% |
Sample
| 1st row | James Cameron |
|---|---|
| 2nd row | Gore Verbinski |
| 3rd row | Sam Mendes |
| 4th row | Christopher Nolan |
| 5th row | Doug Walker |
| Value | Count | Frequency (%) |
| Steven Spielberg | 26 | 0.5% |
| Woody Allen | 22 | 0.4% |
| Clint Eastwood | 20 | 0.4% |
| Martin Scorsese | 20 | 0.4% |
| Spike Lee | 16 | 0.3% |
| Ridley Scott | 16 | 0.3% |
| Renny Harlin | 15 | 0.3% |
| Steven Soderbergh | 15 | 0.3% |
| Oliver Stone | 14 | 0.3% |
| Tim Burton | 14 | 0.3% |
| Other values (2387) | 4636 | |
| (Missing) | 102 | 2.1% |
| Value | Count | Frequency (%) |
| john | 174 | 1.7% |
| david | 144 | 1.4% |
| michael | 125 | 1.2% |
| james | 87 | 0.9% |
| robert | 83 | 0.8% |
| peter | 81 | 0.8% |
| richard | 78 | 0.8% |
| paul | 73 | 0.7% |
| scott | 62 | 0.6% |
| lee | 56 | 0.6% |
| Other values (2965) | 9048 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 5936 | 9.4% |
| (space) | 5197 | 8.3% |
| a | 5154 | 8.2% |
| n | 4547 | 7.2% |
| r | 4333 | 6.9% |
| o | 3684 | 5.8% |
| i | 3605 | 5.7% |
| l | 2917 | 4.6% |
| t | 2257 | 3.6% |
| s | 2040 | 3.2% |
| Other values (66) | 23320 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 47237 | |
| Uppercase Letter | 10221 | 16.2% |
| Space Separator | 5197 | 8.3% |
| Other Punctuation | 251 | 0.4% |
| Dash Punctuation | 84 | 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| e | 5936 | |
| a | 5154 | |
| n | 4547 | |
| r | 4333 | 9.2% |
| o | 3684 | 7.8% |
| i | 3605 | 7.6% |
| l | 2917 | 6.2% |
| t | 2257 | 4.8% |
| s | 2040 | 4.3% |
| h | 1799 | 3.8% |
| Other values (31) | 10965 |
| Value | Count | Frequency (%) |
| S | 977 | 9.6% |
| J | 893 | 8.7% |
| M | 872 | 8.5% |
| R | 733 | 7.2% |
| C | 688 | 6.7% |
| B | 658 | 6.4% |
| D | 602 | 5.9% |
| A | 558 | 5.5% |
| L | 490 | 4.8% |
| P | 472 | 4.6% |
| Other values (21) | 3278 |
| Value | Count | Frequency (%) |
| . | 231 | |
| ' | 20 | 8.0% |
| Value | Count | Frequency (%) |
| (space) | 5197 | |
| Value | Count | Frequency (%) |
| - | 84 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 57458 | |
| Common | 5532 | 8.8% |
Most frequent character per script
| Value | Count | Frequency (%) |
| e | 5936 | 10.3% |
| a | 5154 | 9.0% |
| n | 4547 | 7.9% |
| r | 4333 | 7.5% |
| o | 3684 | 6.4% |
| i | 3605 | 6.3% |
| l | 2917 | 5.1% |
| t | 2257 | 3.9% |
| s | 2040 | 3.6% |
| h | 1799 | 3.1% |
| Other values (62) | 21186 |
| Value | Count | Frequency (%) |
| (space) | 5197 | |
| . | 231 | 4.2% |
| - | 84 | 1.5% |
| ' | 20 | 0.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 62850 | |
| None | 140 | 0.2% |
Most frequent character per block
| Value | Count | Frequency (%) |
| e | 5936 | 9.4% |
| (space) | 5197 | 8.3% |
| a | 5154 | 8.2% |
| n | 4547 | 7.2% |
| r | 4333 | 6.9% |
| o | 3684 | 5.9% |
| i | 3605 | 5.7% |
| l | 2917 | 4.6% |
| t | 2257 | 3.6% |
| s | 2040 | 3.2% |
| Other values (46) | 23180 |
| Value | Count | Frequency (%) |
| é | 43 | |
| á | 19 | |
| ó | 16 | 11.4% |
| ö | 16 | 11.4% |
| í | 8 | 5.7% |
| ñ | 7 | 5.0% |
| å | 6 | 4.3% |
| ç | 5 | 3.6% |
| É | 3 | 2.1% |
| ø | 2 | 1.4% |
| Other values (10) | 15 | 10.7% |
num_critic_for_reviews
Real number (ℝ≥0)
| Distinct | 528 |
|---|---|
| Distinct (%) | 10.8% |
| Missing | 49 |
| Missing (%) | 1.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 137.9889049 |
|---|---|
| Minimum | 1 |
| Maximum | 813 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 38.5 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 9 |
| Q1 | 49 |
| median | 108 |
| Q3 | 191 |
| 95-th percentile | 378 |
| Maximum | 813 |
| Range | 812 |
| Interquartile range (IQR) | 142 |
Descriptive statistics
| Standard deviation | 120.2393792 |
|---|---|
| Coefficient of variation (CV) | 0.8713699067 |
| Kurtosis | 3.048992494 |
| Mean | 137.9889049 |
| Median Absolute Deviation (MAD) | 67 |
| Skewness | 1.545198025 |
| Sum | 671592 |
| Variance | 14457.5083 |
| Monotonicity | Not monotonic |
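Several rows in the quantile and descriptive tables are derived from others: the range from minimum and maximum, the IQR from Q1 and Q3, the coefficient of variation from standard deviation and mean, and the variance from the standard deviation. A short sketch of those relationships, using the figures reported for this variable (variable names are illustrative):

```python
# Figures copied from the tables for num_critic_for_reviews.
minimum, maximum = 1, 813
q1, q3 = 49, 191
mean = 137.9889049
std = 120.2393792

value_range = maximum - minimum  # spread between the extremes
iqr = q3 - q1                    # interquartile range
cv = std / mean                  # coefficient of variation
variance = std ** 2              # square of the standard deviation

print(f"range = {value_range}, IQR = {iqr}")
print(f"CV = {cv:.6f}, variance = {variance:.4f}")
```

The recomputed values agree with the reported ones up to rounding of the inputs.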
| Value | Count | Frequency (%) |
| 1 | 40 | 0.8% |
| 5 | 36 | 0.7% |
| 9 | 36 | 0.7% |
| 12 | 34 | 0.7% |
| 8 | 34 | 0.7% |
| 10 | 34 | 0.7% |
| 16 | 33 | 0.7% |
| 81 | 31 | 0.6% |
| 29 | 30 | 0.6% |
| 43 | 30 | 0.6% |
| Other values (518) | 4529 | |
| (Missing) | 49 | 1.0% |
| Value | Count | Frequency (%) |
| 1 | 40 | |
| 2 | 26 | |
| 3 | 24 | |
| 4 | 29 | |
| 5 | 36 |
| Value | Count | Frequency (%) |
| 813 | 1 | |
| 775 | 1 | |
| 765 | 1 | |
| 750 | 1 | |
| 739 | 1 |
duration
Real number (ℝ≥0)
| Distinct | 191 |
|---|---|
| Distinct (%) | 3.9% |
| Missing | 15 |
| Missing (%) | 0.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 107.0907978 |
|---|---|
| Minimum | 7 |
| Maximum | 511 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 38.5 KiB |
Quantile statistics
| Minimum | 7 |
|---|---|
| 5-th percentile | 81 |
| Q1 | 93 |
| median | 103 |
| Q3 | 118 |
| 95-th percentile | 146 |
| Maximum | 511 |
| Range | 504 |
| Interquartile range (IQR) | 25 |
Descriptive statistics
| Standard deviation | 25.28601531 |
|---|---|
| Coefficient of variation (CV) | 0.2361175361 |
| Kurtosis | 22.79584339 |
| Mean | 107.0907978 |
| Median Absolute Deviation (MAD) | 12 |
| Skewness | 2.357977091 |
| Sum | 524852 |
| Variance | 639.3825704 |
| Monotonicity | Not monotonic |
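The integer MAD values in this report suggest that MAD here means the unscaled median of absolute deviations from the median; that reading is an assumption. A minimal sketch of that definition follows, run on hypothetical running times rather than the dataset itself:

```python
from statistics import median


def median_absolute_deviation(values):
    """Return the median of absolute deviations from the median."""
    values = list(values)
    center = median(values)
    return median(abs(v - center) for v in values)


# Hypothetical running times in minutes, not taken from the dataset.
durations = [81, 93, 98, 103, 110, 118, 146]
print(median_absolute_deviation(durations))
```

Unlike the standard deviation, this measure is barely affected by a single extreme value such as the 511-minute maximum reported above.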
| Value | Count | Frequency (%) |
| 90 | 160 | 3.3% |
| 100 | 137 | 2.8% |
| 98 | 135 | 2.7% |
| 101 | 132 | 2.7% |
| 97 | 131 | 2.7% |
| 93 | 125 | 2.5% |
| 99 | 123 | 2.5% |
| 94 | 122 | 2.5% |
| 95 | 121 | 2.5% |
| 96 | 111 | 2.3% |
| Other values (181) | 3604 |
| Value | Count | Frequency (%) |
| 7 | 2 | < 0.1% |
| 11 | 1 | < 0.1% |
| 14 | 1 | < 0.1% |
| 20 | 1 | < 0.1% |
| 22 | 7 |
| Value | Count | Frequency (%) |
| 511 | 1 | |
| 334 | 1 | |
| 330 | 1 | |
| 325 | 1 | |
| 300 | 1 |
director_facebook_likes
Real number (ℝ≥0)
| Distinct | 435 |
|---|---|
| Distinct (%) | 9.0% |
| Missing | 102 |
| Missing (%) | 2.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 691.0145409 |
|---|---|
| Minimum | 0 |
| Maximum | 23000 |
| Zeros | 877 |
| Zeros (%) | 17.8% |
| Memory size | 38.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 7 |
| median | 48 |
| Q3 | 189.75 |
| 95-th percentile | 982.45 |
| Maximum | 23000 |
| Range | 23000 |
| Interquartile range (IQR) | 182.75 |
Descriptive statistics
| Standard deviation | 2832.954125 |
|---|---|
| Coefficient of variation (CV) | 4.099702621 |
| Kurtosis | 26.97306552 |
| Mean | 691.0145409 |
| Median Absolute Deviation (MAD) | 48 |
| Skewness | 5.205766151 |
| Sum | 3326544 |
| Variance | 8025629.073 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 877 | 17.8% |
| 3 | 69 | 1.4% |
| 6 | 66 | 1.3% |
| 7 | 63 | 1.3% |
| 2 | 63 | 1.3% |
| 4 | 60 | 1.2% |
| 11 | 57 | 1.2% |
| 10 | 53 | 1.1% |
| 5 | 52 | 1.1% |
| 8 | 51 | 1.0% |
| Other values (425) | 3403 | |
| (Missing) | 102 | 2.1% |
| Value | Count | Frequency (%) |
| 0 | 877 | |
| 2 | 63 | 1.3% |
| 3 | 69 | 1.4% |
| 4 | 60 | 1.2% |
| 5 | 52 | 1.1% |
| Value | Count | Frequency (%) |
| 23000 | 1 | < 0.1% |
| 22000 | 8 | |
| 21000 | 10 | |
| 20000 | 1 | < 0.1% |
| 18000 | 4 | 0.1% |
actor_3_facebook_likes
Real number (ℝ≥0)
| Distinct | 906 |
|---|---|
| Distinct (%) | 18.5% |
| Missing | 23 |
| Missing (%) | 0.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 631.2763131 |
|---|---|
| Minimum | 0 |
| Maximum | 23000 |
| Zeros | 89 |
| Zeros (%) | 1.8% |
| Memory size | 38.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 9 |
| Q1 | 132 |
| median | 366 |
| Q3 | 633 |
| 95-th percentile | 1000 |
| Maximum | 23000 |
| Range | 23000 |
| Interquartile range (IQR) | 501 |
Descriptive statistics
| Standard deviation | 1625.874802 |
|---|---|
| Coefficient of variation (CV) | 2.575535891 |
| Kurtosis | 63.57766761 |
| Mean | 631.2763131 |
| Median Absolute Deviation (MAD) | 246 |
| Skewness | 7.441519978 |
| Sum | 3088835 |
| Variance | 2643468.87 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1000 | 118 | 2.4% |
| 0 | 89 | 1.8% |
| 11000 | 27 | 0.5% |
| 2000 | 26 | 0.5% |
| 3 | 26 | 0.5% |
| 3000 | 24 | 0.5% |
| 4 | 21 | 0.4% |
| 826 | 21 | 0.4% |
| 7 | 21 | 0.4% |
| 2 | 20 | 0.4% |
| Other values (896) | 4500 | |
| (Missing) | 23 | 0.5% |
| Value | Count | Frequency (%) |
| 0 | 89 | |
| 2 | 20 | 0.4% |
| 3 | 26 | 0.5% |
| 4 | 21 | 0.4% |
| 5 | 18 | 0.4% |
| Value | Count | Frequency (%) |
| 23000 | 2 | |
| 20000 | 1 | < 0.1% |
| 19000 | 4 | |
| 17000 | 1 | < 0.1% |
| 16000 | 3 |
actor_2_name
Categorical
| Distinct | 3030 |
|---|---|
| Distinct (%) | 61.8% |
| Missing | 13 |
| Missing (%) | 0.3% |
| Memory size | 38.5 KiB |
| Morgan Freeman | 18 |
|---|---|
| Charlize Theron | 14 |
| Brad Pitt | 13 |
| Meryl Streep | 11 |
| Adam Sandler | 10 |
| Other values (3025) |
Length
| Max length | 28 |
|---|---|
| Median length | 13 |
| Mean length | 13.07526004 |
| Min length | 3 |
Characters and Unicode
| Total characters | 64108 |
|---|---|
| Distinct characters | 80 |
| Distinct categories | 6 |
| Distinct scripts | 2 |
| Distinct blocks | 2 |
Unique
| Unique | 2125 |
|---|---|
| Unique (%) | 43.3% |
Sample
| 1st row | Joel David Moore |
|---|---|
| 2nd row | Orlando Bloom |
| 3rd row | Rory Kinnear |
| 4th row | Christian Bale |
| 5th row | Rob Walker |
| Value | Count | Frequency (%) |
| Morgan Freeman | 18 | 0.4% |
| Charlize Theron | 14 | 0.3% |
| Brad Pitt | 13 | 0.3% |
| Meryl Streep | 11 | 0.2% |
| Adam Sandler | 10 | 0.2% |
| James Franco | 10 | 0.2% |
| Will Ferrell | 9 | 0.2% |
| Scott Glenn | 9 | 0.2% |
| Bruce Willis | 9 | 0.2% |
| Jada Pinkett Smith | 8 | 0.2% |
| Other values (3020) | 4792 | |
| (Missing) | 13 | 0.3% |
| Value | Count | Frequency (%) |
| michael | 102 | 1.0% |
| david | 58 | 0.6% |
| john | 55 | 0.5% |
| james | 52 | 0.5% |
| scott | 51 | 0.5% |
| tom | 50 | 0.5% |
| jason | 41 | 0.4% |
| robert | 41 | 0.4% |
| kevin | 40 | 0.4% |
| bruce | 39 | 0.4% |
| Other values (3823) | 9614 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 6048 | 9.4% |
| a | 5789 | 9.0% |
| (space) | 5240 | 8.2% |
| n | 4623 | 7.2% |
| r | 4295 | 6.7% |
| i | 3930 | 6.1% |
| o | 3553 | 5.5% |
| l | 3329 | 5.2% |
| t | 2287 | 3.6% |
| s | 2106 | 3.3% |
| Other values (70) | 22908 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 48195 | |
| Uppercase Letter | 10422 | 16.3% |
| Space Separator | 5240 | 8.2% |
| Other Punctuation | 182 | 0.3% |
| Dash Punctuation | 63 | 0.1% |
| Decimal Number | 6 | < 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| e | 6048 | |
| a | 5789 | |
| n | 4623 | |
| r | 4295 | |
| i | 3930 | 8.2% |
| o | 3553 | 7.4% |
| l | 3329 | 6.9% |
| t | 2287 | 4.7% |
| s | 2106 | 4.4% |
| h | 1761 | 3.7% |
| Other values (38) | 10474 |
| Value | Count | Frequency (%) |
| M | 972 | 9.3% |
| S | 799 | 7.7% |
| C | 793 | 7.6% |
| B | 761 | 7.3% |
| J | 750 | 7.2% |
| D | 646 | 6.2% |
| A | 625 | 6.0% |
| R | 580 | 5.6% |
| L | 501 | 4.8% |
| T | 448 | 4.3% |
| Other values (16) | 3547 |
| Value | Count | Frequency (%) |
| . | 119 | |
| ' | 63 |
| Value | Count | Frequency (%) |
| 5 | 3 | |
| 0 | 3 |
| Value | Count | Frequency (%) |
| (space) | 5240 | |
| Value | Count | Frequency (%) |
| - | 63 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 58617 | |
| Common | 5491 | 8.6% |
Most frequent character per script
| Value | Count | Frequency (%) |
| e | 6048 | 10.3% |
| a | 5789 | 9.9% |
| n | 4623 | 7.9% |
| r | 4295 | 7.3% |
| i | 3930 | 6.7% |
| o | 3553 | 6.1% |
| l | 3329 | 5.7% |
| t | 2287 | 3.9% |
| s | 2106 | 3.6% |
| h | 1761 | 3.0% |
| Other values (64) | 20896 |
| Value | Count | Frequency (%) |
| (space) | 5240 | |
| . | 119 | 2.2% |
| - | 63 | 1.1% |
| ' | 63 | 1.1% |
| 5 | 3 | 0.1% |
| 0 | 3 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 63989 | |
| None | 119 | 0.2% |
Most frequent character per block
| Value | Count | Frequency (%) |
| e | 6048 | 9.5% |
| a | 5789 | 9.0% |
| (space) | 5240 | 8.2% |
| n | 4623 | 7.2% |
| r | 4295 | 6.7% |
| i | 3930 | 6.1% |
| o | 3553 | 5.6% |
| l | 3329 | 5.2% |
| t | 2287 | 3.6% |
| s | 2106 | 3.3% |
| Other values (48) | 22789 |
| Value | Count | Frequency (%) |
| é | 43 | |
| í | 12 | 10.1% |
| á | 10 | 8.4% |
| ë | 8 | 6.7% |
| ø | 6 | 5.0% |
| ó | 6 | 5.0% |
| å | 4 | 3.4% |
| ü | 4 | 3.4% |
| ç | 3 | 2.5% |
| û | 3 | 2.5% |
| Other values (12) | 20 |
actor_1_facebook_likes
Real number (ℝ≥0)
| Distinct | 877 |
|---|---|
| Distinct (%) | 17.9% |
| Missing | 7 |
| Missing (%) | 0.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6494.488491 |
|---|---|
| Minimum | 0 |
| Maximum | 640000 |
| Zeros | 26 |
| Zeros (%) | 0.5% |
| Memory size | 38.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 93 |
| Q1 | 607 |
| median | 982 |
| Q3 | 11000 |
| 95-th percentile | 23000 |
| Maximum | 640000 |
| Range | 640000 |
| Interquartile range (IQR) | 10393 |
Descriptive statistics
| Standard deviation | 15106.98688 |
|---|---|
| Coefficient of variation (CV) | 2.326124206 |
| Kurtosis | 685.6853809 |
| Mean | 6494.488491 |
| Median Absolute Deviation (MAD) | 738 |
| Skewness | 19.27602317 |
| Sum | 31881444 |
| Variance | 228221052.7 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1000 | 436 | 8.9% |
| 11000 | 206 | 4.2% |
| 2000 | 189 | 3.8% |
| 3000 | 150 | 3.1% |
| 12000 | 131 | 2.7% |
| 13000 | 123 | 2.5% |
| 14000 | 120 | 2.4% |
| 10000 | 109 | 2.2% |
| 18000 | 106 | 2.2% |
| 22000 | 80 | 1.6% |
| Other values (867) | 3259 |
| Value | Count | Frequency (%) |
| 0 | 26 | |
| 2 | 8 | 0.2% |
| 3 | 4 | 0.1% |
| 4 | 2 | < 0.1% |
| 5 | 7 | 0.1% |
| Value | Count | Frequency (%) |
| 640000 | 1 | < 0.1% |
| 260000 | 3 | 0.1% |
| 164000 | 2 | < 0.1% |
| 137000 | 2 | < 0.1% |
| 87000 | 8 |
gross
Real number (ℝ≥0)
| Distinct | 4033 |
|---|---|
| Distinct (%) | 99.5% |
| Missing | 862 |
| Missing (%) | 17.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 47644514.53 |
|---|---|
| Minimum | 162 |
| Maximum | 760505847 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 38.5 KiB |
Quantile statistics
| Minimum | 162 |
|---|---|
| 5-th percentile | 96209.7 |
| Q1 | 5019656.25 |
| median | 25043962 |
| Q3 | 61108412.75 |
| 95-th percentile | 177424688.4 |
| Maximum | 760505847 |
| Range | 760505685 |
| Interquartile range (IQR) | 56088756.5 |
Descriptive statistics
| Standard deviation | 67372553.83 |
|---|---|
| Coefficient of variation (CV) | 1.41406738 |
| Kurtosis | 14.93226526 |
| Mean | 47644514.53 |
| Median Absolute Deviation (MAD) | 22912754.5 |
| Skewness | 3.12698398 |
| Sum | 1.931508619 × 10^11 |
| Variance | 4.53906101 × 10^15 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 3000000 | 3 | 0.1% |
| 8000000 | 3 | 0.1% |
| 1000000 | 2 | < 0.1% |
| 30400000 | 2 | < 0.1% |
| 36200000 | 2 | < 0.1% |
| 32000000 | 2 | < 0.1% |
| 26400000 | 2 | < 0.1% |
| 25000000 | 2 | < 0.1% |
| 76400000 | 2 | < 0.1% |
| 800000 | 2 | < 0.1% |
| Other values (4023) | 4032 | |
| (Missing) | 862 | 17.5% |
| Value | Count | Frequency (%) |
| 162 | 1 | |
| 703 | 1 | |
| 721 | 1 | |
| 828 | 1 | |
| 1111 | 1 |
| Value | Count | Frequency (%) |
| 760505847 | 1 | |
| 658672302 | 1 | |
| 652177271 | 1 | |
| 623279547 | 1 | |
| 533316061 | 1 |
genres
Categorical
| Distinct | 914 |
|---|---|
| Distinct (%) | 18.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 38.5 KiB |
| Drama | 233 |
|---|---|
| Comedy | 205 |
| Comedy|Drama | 189 |
| Comedy|Drama|Romance | 185 |
| Comedy|Romance | 157 |
| Other values (909) |
Length
| Max length | 64 |
|---|---|
| Median length | 20 |
| Mean length | 20.28417413 |
| Min length | 5 |
Characters and Unicode
| Total characters | 99717 |
|---|---|
| Distinct characters | 35 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 507 |
|---|---|
| Unique (%) | 10.3% |
Sample
| 1st row | Action|Adventure|Fantasy|Sci-Fi |
|---|---|
| 2nd row | Action|Adventure|Fantasy |
| 3rd row | Action|Adventure|Thriller |
| 4th row | Action|Thriller |
| 5th row | Documentary |
| Value | Count | Frequency (%) |
| Drama | 233 | 4.7% |
| Comedy | 205 | 4.2% |
| Comedy|Drama | 189 | 3.8% |
| Comedy|Drama|Romance | 185 | 3.8% |
| Comedy|Romance | 157 | 3.2% |
| Drama|Romance | 150 | 3.1% |
| Crime|Drama|Thriller | 98 | 2.0% |
| Horror | 67 | 1.4% |
| Action|Crime|Drama|Thriller | 65 | 1.3% |
| Drama|Thriller | 62 | 1.3% |
| Other values (904) | 3505 |
| Value | Count | Frequency (%) |
| drama | 233 | 4.7% |
| comedy | 205 | 4.2% |
| comedy|drama | 189 | 3.8% |
| comedy|drama|romance | 185 | 3.8% |
| comedy|romance | 157 | 3.2% |
| drama|romance | 150 | 3.1% |
| crime|drama|thriller | 98 | 2.0% |
| horror | 67 | 1.4% |
| action|crime|drama|thriller | 65 | 1.3% |
| action|crime|thriller | 62 | 1.3% |
| Other values (904) | 3505 |
Most occurring characters
| Value | Count | Frequency (%) |
| r | 10220 | 10.2% |
| | | 9208 | 9.2% |
| a | 8846 | 8.9% |
| e | 7738 | 7.8% |
| m | 7234 | 7.3% |
| i | 6394 | 6.4% |
| o | 6163 | 6.2% |
| y | 4550 | 4.6% |
| n | 4363 | 4.4% |
| t | 3910 | 3.9% |
| Other values (25) | 31091 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 75179 | |
| Uppercase Letter | 14728 | 14.8% |
| Math Symbol | 9208 | 9.2% |
| Dash Punctuation | 602 | 0.6% |
Most frequent character per category
| Value | Count | Frequency (%) |
| r | 10220 | |
| a | 8846 | |
| e | 7738 | |
| m | 7234 | |
| i | 6394 | |
| o | 6163 | |
| y | 4550 | 6.1% |
| n | 4363 | 5.8% |
| t | 3910 | 5.2% |
| l | 3399 | 4.5% |
| Other values (9) | 12362 |
| Value | Count | Frequency (%) |
| C | 2715 | |
| D | 2654 | |
| A | 2241 | |
| F | 1716 | |
| T | 1365 | |
| R | 1086 | 7.4% |
| M | 828 | 5.6% |
| S | 776 | 5.3% |
| H | 740 | 5.0% |
| W | 304 | 2.1% |
| Other values (4) | 303 | 2.1% |
| Value | Count | Frequency (%) |
| | | 9208 |
| Value | Count | Frequency (%) |
| - | 602 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 89907 | |
| Common | 9810 | 9.8% |
Most frequent character per script
| Value | Count | Frequency (%) |
| r | 10220 | 11.4% |
| a | 8846 | 9.8% |
| e | 7738 | 8.6% |
| m | 7234 | 8.0% |
| i | 6394 | 7.1% |
| o | 6163 | 6.9% |
| y | 4550 | 5.1% |
| n | 4363 | 4.9% |
| t | 3910 | 4.3% |
| l | 3399 | 3.8% |
| Other values (23) | 27090 |
| Value | Count | Frequency (%) |
| | | 9208 | |
| - | 602 | 6.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 99717 |
Most frequent character per block
| Value | Count | Frequency (%) |
| r | 10220 | 10.2% |
| | | 9208 | 9.2% |
| a | 8846 | 8.9% |
| e | 7738 | 7.8% |
| m | 7234 | 7.3% |
| i | 6394 | 6.4% |
| o | 6163 | 6.2% |
| y | 4550 | 4.6% |
| n | 4363 | 4.4% |
| t | 3910 | 3.9% |
| Other values (25) | 31091 |
actor_1_name
Categorical
| Distinct | 2095 |
|---|---|
| Distinct (%) | 42.7% |
| Missing | 7 |
| Missing (%) | 0.1% |
| Memory size | 38.5 KiB |
| Robert De Niro | 48 |
|---|---|
| Johnny Depp | 36 |
| Nicolas Cage | 32 |
| J.K. Simmons | 29 |
| Matt Damon | 29 |
| Other values (2090) |
Length
| Max length | 27 |
|---|---|
| Median length | 13 |
| Mean length | 13.20228152 |
| Min length | 4 |
Characters and Unicode
| Total characters | 64810 |
|---|---|
| Distinct characters | 76 |
| Distinct categories | 6 |
| Distinct scripts | 2 |
| Distinct blocks | 2 |
Unique
| Unique | 1379 |
|---|---|
| Unique (%) | 28.1% |
Sample
| 1st row | CCH Pounder |
|---|---|
| 2nd row | Johnny Depp |
| 3rd row | Christoph Waltz |
| 4th row | Tom Hardy |
| 5th row | Doug Walker |
| Value | Count | Frequency (%) |
| Robert De Niro | 48 | 1.0% |
| Johnny Depp | 36 | 0.7% |
| Nicolas Cage | 32 | 0.7% |
| J.K. Simmons | 29 | 0.6% |
| Matt Damon | 29 | 0.6% |
| Denzel Washington | 29 | 0.6% |
| Bruce Willis | 28 | 0.6% |
| Steve Buscemi | 27 | 0.5% |
| Liam Neeson | 27 | 0.5% |
| Harrison Ford | 27 | 0.5% |
| Other values (2085) | 4597 |
| Value | Count | Frequency (%) |
| robert | 106 | 1.0% |
| tom | 90 | 0.9% |
| michael | 88 | 0.9% |
| jason | 57 | 0.6% |
| de | 56 | 0.5% |
| james | 52 | 0.5% |
| steve | 50 | 0.5% |
| bruce | 49 | 0.5% |
| niro | 48 | 0.5% |
| jr | 47 | 0.5% |
| Other values (2885) | 9539 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 6058 | 9.3% |
| a | 5606 | 8.6% |
| (space) | 5273 | 8.1% |
| n | 4700 | 7.3% |
| r | 4215 | 6.5% |
| i | 4140 | 6.4% |
| o | 3821 | 5.9% |
| l | 3242 | 5.0% |
| t | 2506 | 3.9% |
| s | 2281 | 3.5% |
| Other values (66) | 22968 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 48806 | |
| Uppercase Letter | 10441 | 16.1% |
| Space Separator | 5273 | 8.1% |
| Other Punctuation | 217 | 0.3% |
| Dash Punctuation | 71 | 0.1% |
| Decimal Number | 2 | < 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| e | 6058 | |
| a | 5606 | |
| n | 4700 | |
| r | 4215 | |
| i | 4140 | 8.5% |
| o | 3821 | 7.8% |
| l | 3242 | 6.6% |
| t | 2506 | 5.1% |
| s | 2281 | 4.7% |
| h | 1747 | 3.6% |
| Other values (32) | 10490 |
| Value | Count | Frequency (%) |
| J | 918 | 8.8% |
| M | 894 | 8.6% |
| S | 831 | 8.0% |
| C | 799 | 7.7% |
| B | 729 | 7.0% |
| D | 706 | 6.8% |
| R | 617 | 5.9% |
| H | 511 | 4.9% |
| A | 496 | 4.8% |
| L | 479 | 4.6% |
| Other values (18) | 3461 |
| Value | Count | Frequency (%) |
| . | 173 | |
| ' | 44 | 20.3% |
| Value | Count | Frequency (%) |
| 5 | 1 | |
| 0 | 1 |
| Value | Count | Frequency (%) |
| (space) | 5273 | |
| Value | Count | Frequency (%) |
| - | 71 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 59247 | |
| Common | 5563 | 8.6% |
Most frequent character per script
| Value | Count | Frequency (%) |
| e | 6058 | 10.2% |
| a | 5606 | 9.5% |
| n | 4700 | 7.9% |
| r | 4215 | 7.1% |
| i | 4140 | 7.0% |
| o | 3821 | 6.4% |
| l | 3242 | 5.5% |
| t | 2506 | 4.2% |
| s | 2281 | 3.8% |
| h | 1747 | 2.9% |
| Other values (60) | 20931 |
| Value | Count | Frequency (%) |
| (space) | 5273 | |
| . | 173 | 3.1% |
| - | 71 | 1.3% |
| ' | 44 | 0.8% |
| 5 | 1 | < 0.1% |
| 0 | 1 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 64732 | |
| None | 78 | 0.1% |
Most frequent character per block
| Value | Count | Frequency (%) |
| e | 6058 | 9.4% |
| a | 5606 | 8.7% |
| (space) | 5273 | 8.1% |
| n | 4700 | 7.3% |
| r | 4215 | 6.5% |
| i | 4140 | 6.4% |
| o | 3821 | 5.9% |
| l | 3242 | 5.0% |
| t | 2506 | 3.9% |
| s | 2281 | 3.5% |
| Other values (48) | 22890 |
| Value | Count | Frequency (%) |
| é | 19 | |
| ë | 14 | |
| á | 7 | 9.0% |
| í | 6 | 7.7% |
| ç | 5 | 6.4% |
| å | 5 | 6.4% |
| ø | 4 | 5.1% |
| Ó | 3 | 3.8% |
| ô | 2 | 2.6% |
| à | 2 | 2.6% |
| Other values (8) | 11 |
num_voted_users
Real number (ℝ≥0)
| Distinct | 4750 |
|---|---|
| Distinct (%) | 96.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 82644.92494 |
|---|---|
| Minimum | 5 |
| Maximum | 1689764 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 38.5 KiB |
Quantile statistics
| Minimum | 5 |
|---|---|
| 5-th percentile | 507.25 |
| Q1 | 8361.75 |
| median | 33132.5 |
| Q3 | 93772.75 |
| 95-th percentile | 330310 |
| Maximum | 1689764 |
| Range | 1689759 |
| Interquartile range (IQR) | 85411 |
Descriptive statistics
| Standard deviation | 138322.1625 |
|---|---|
| Coefficient of variation (CV) | 1.673692155 |
| Kurtosis | 24.91998064 |
| Mean | 82644.92494 |
| Median Absolute Deviation (MAD) | 29810.5 |
| Skewness | 4.074557576 |
| Sum | 406282451 |
| Variance | 1.913302065 × 10¹⁰ |
| Monotonicity | Not monotonic |
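Several of the derived statistics for num_voted_users follow directly from the other reported values. The sketch below is only a consistency check on the numbers copied from the tables above, not the profiling tool's own code; the variable names are illustrative.

```python
import math

# Values copied from the num_voted_users tables above.
mean = 82644.92494
std = 138322.1625
q1, q3 = 8361.75, 93772.75
minimum, maximum = 5, 1689764

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
variance = std ** 2              # variance is the squared standard deviation

print(f"CV={cv:.6f}  IQR={iqr}  range={value_range}  variance={variance:.4e}")
```

A CV well above 1 together with a mean far above the median (82644.9 versus 33132.5) is consistent with the strong right skew reported for this variable.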
| Value | Count | Frequency (%) |
| 57 | 5 | 0.1% |
| 6 | 4 | 0.1% |
| 2541 | 3 | 0.1% |
| 38 | 3 | 0.1% |
| 53 | 3 | 0.1% |
| 3119 | 3 | 0.1% |
| 3665 | 3 | 0.1% |
| 8 | 3 | 0.1% |
| 162 | 3 | 0.1% |
| 3943 | 2 | < 0.1% |
| Other values (4740) | 4884 |
| Value | Count | Frequency (%) |
| 5 | 2 | |
| 6 | 4 | |
| 7 | 2 | |
| 8 | 3 | |
| 10 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 1689764 | 1 | |
| 1676169 | 1 | |
| 1468200 | 1 | |
| 1347461 | 1 | |
| 1324680 | 1 |
| Distinct | 3960 |
|---|---|
| Distinct (%) | 80.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9579.815907 |
|---|---|
| Minimum | 0 |
| Maximum | 656730 |
| Zeros | 33 |
| Zeros (%) | 0.7% |
| Memory size | 38.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 173.75 |
| Q1 | 1394.75 |
| median | 3049 |
| Q3 | 13616.75 |
| 95-th percentile | 36483.75 |
| Maximum | 656730 |
| Range | 656730 |
| Interquartile range (IQR) | 12222 |
Descriptive statistics
| Standard deviation | 18164.31699 |
|---|---|
| Coefficient of variation (CV) | 1.896102928 |
| Kurtosis | 370.782513 |
| Mean | 9579.815907 |
| Median Absolute Deviation (MAD) | 2262.5 |
| Skewness | 13.12069073 |
| Sum | 47094375 |
| Variance | 329942411.7 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 33 | 0.7% |
| 5 | 7 | 0.1% |
| 2 | 6 | 0.1% |
| 29 | 5 | 0.1% |
| 2020 | 5 | 0.1% |
| 1044 | 5 | 0.1% |
| 2730 | 4 | 0.1% |
| 1554 | 4 | 0.1% |
| 81 | 4 | 0.1% |
| 1936 | 4 | 0.1% |
| Other values (3950) | 4839 |
| Value | Count | Frequency (%) |
| 0 | 33 | |
| 2 | 6 | 0.1% |
| 3 | 1 | < 0.1% |
| 4 | 2 | < 0.1% |
| 5 | 7 | 0.1% |
| Value | Count | Frequency (%) |
| 656730 | 1 | |
| 303717 | 1 | |
| 283939 | 1 | |
| 263584 | 1 | |
| 261818 | 1 |
actor_3_name
Categorical
| Distinct | 3519 |
|---|---|
| Distinct (%) | 71.9% |
| Missing | 23 |
| Missing (%) | 0.5% |
| Memory size | 38.5 KiB |
| Steve Coogan | 8 |
|---|---|
| Stephen Root | 7 |
| Jon Gries | 7 |
| Ben Mendelsohn | 7 |
| Robert Duvall | 7 |
| Other values (3514) |
Length
| Max length | 29 |
|---|---|
| Median length | 13 |
| Mean length | 13.07888821 |
| Min length | 3 |
Characters and Unicode
| Total characters | 63995 |
|---|---|
| Distinct characters | 81 |
| Distinct categories | 6 |
| Distinct scripts | 2 |
| Distinct blocks | 2 |
Unique
| Unique | 2704 |
|---|---|
| Unique (%) | 55.3% |
Sample
| 1st row | Wes Studi |
|---|---|
| 2nd row | Jack Davenport |
| 3rd row | Stephanie Sigman |
| 4th row | Joseph Gordon-Levitt |
| 5th row | Polly Walker |
| Value | Count | Frequency (%) |
| Steve Coogan | 8 | 0.2% |
| Stephen Root | 7 | 0.1% |
| Jon Gries | 7 | 0.1% |
| Ben Mendelsohn | 7 | 0.1% |
| Robert Duvall | 7 | 0.1% |
| Sam Shepard | 7 | 0.1% |
| Paul Sorvino | 6 | 0.1% |
| Anne Hathaway | 6 | 0.1% |
| Lois Maxwell | 6 | 0.1% |
| Kirsten Dunst | 6 | 0.1% |
| Other values (3509) | 4826 | |
| (Missing) | 23 | 0.5% |
| Value | Count | Frequency (%) |
| michael | 85 | 0.8% |
| john | 78 | 0.8% |
| david | 68 | 0.7% |
| james | 66 | 0.7% |
| robert | 45 | 0.4% |
| tom | 42 | 0.4% |
| kevin | 41 | 0.4% |
| paul | 39 | 0.4% |
| peter | 38 | 0.4% |
| scott | 36 | 0.4% |
| Other values (4305) | 9592 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 6030 | 9.4% |
| a | 5847 | 9.1% |
| (space) | 5237 | 8.2% |
| n | 4474 | 7.0% |
| r | 4079 | 6.4% |
| i | 3867 | 6.0% |
| o | 3490 | 5.5% |
| l | 3413 | 5.3% |
| t | 2311 | 3.6% |
| s | 2265 | 3.5% |
| Other values (71) | 22982 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 48035 | |
| Uppercase Letter | 10417 | 16.3% |
| Space Separator | 5237 | 8.2% |
| Other Punctuation | 226 | 0.4% |
| Dash Punctuation | 78 | 0.1% |
| Decimal Number | 2 | < 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| e | 6030 | |
| a | 5847 | |
| n | 4474 | |
| r | 4079 | 8.5% |
| i | 3867 | 8.1% |
| o | 3490 | 7.3% |
| l | 3413 | 7.1% |
| t | 2311 | 4.8% |
| s | 2265 | 4.7% |
| h | 1810 | 3.8% |
| Other values (34) | 10449 |
| Value | Count | Frequency (%) |
| M | 953 | 9.1% |
| S | 815 | 7.8% |
| J | 810 | 7.8% |
| B | 781 | 7.5% |
| C | 774 | 7.4% |
| D | 635 | 6.1% |
| R | 602 | 5.8% |
| A | 568 | 5.5% |
| L | 523 | 5.0% |
| K | 454 | 4.4% |
| Other values (21) | 3502 |
| Value | Count | Frequency (%) |
| . | 163 | |
| ' | 63 | 27.9% |
| Value | Count | Frequency (%) |
| 5 | 1 | |
| 0 | 1 |
| Value | Count | Frequency (%) |
| (space) | 5237 |
| Value | Count | Frequency (%) |
| - | 78 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 58452 | |
| Common | 5543 | 8.7% |
Most frequent character per script
| Value | Count | Frequency (%) |
| e | 6030 | 10.3% |
| a | 5847 | 10.0% |
| n | 4474 | 7.7% |
| r | 4079 | 7.0% |
| i | 3867 | 6.6% |
| o | 3490 | 6.0% |
| l | 3413 | 5.8% |
| t | 2311 | 4.0% |
| s | 2265 | 3.9% |
| h | 1810 | 3.1% |
| Other values (65) | 20866 |
| Value | Count | Frequency (%) |
| (space) | 5237 | |
| . | 163 | 2.9% |
| - | 78 | 1.4% |
| ' | 63 | 1.1% |
| 5 | 1 | < 0.1% |
| 0 | 1 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 63862 | |
| None | 133 | 0.2% |
Most frequent character per block
| Value | Count | Frequency (%) |
| e | 6030 | 9.4% |
| a | 5847 | 9.2% |
| (space) | 5237 | 8.2% |
| n | 4474 | 7.0% |
| r | 4079 | 6.4% |
| i | 3867 | 6.1% |
| o | 3490 | 5.5% |
| l | 3413 | 5.3% |
| t | 2311 | 3.6% |
| s | 2265 | 3.5% |
| Other values (48) | 22849 |
| Value | Count | Frequency (%) |
| é | 48 | |
| í | 14 | 10.5% |
| á | 13 | 9.8% |
| ó | 9 | 6.8% |
| ë | 7 | 5.3% |
| ü | 7 | 5.3% |
| à | 5 | 3.8% |
| è | 4 | 3.0% |
| ç | 3 | 2.3% |
| ö | 3 | 2.3% |
| Other values (13) | 20 |
| Distinct | 19 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 13 |
| Missing (%) | 0.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.377320008 |
|---|---|
| Minimum | 0 |
| Maximum | 43 |
| Zeros | 2089 |
| Zeros (%) | 42.5% |
| Memory size | 38.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 5 |
| Maximum | 43 |
| Range | 43 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 2.023825749 |
|---|---|
| Coefficient of variation (CV) | 1.469393995 |
| Kurtosis | 52.2146141 |
| Mean | 1.377320008 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 4.405495913 |
| Sum | 6753 |
| Variance | 4.095870662 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 2089 | |
| 1 | 1224 | |
| 2 | 702 | 14.3% |
| 3 | 369 | 7.5% |
| 4 | 198 | 4.0% |
| 5 | 113 | 2.3% |
| 6 | 75 | 1.5% |
| 7 | 48 | 1.0% |
| 8 | 37 | 0.8% |
| 9 | 17 | 0.3% |
| Other values (9) | 31 | 0.6% |
| (Missing) | 13 | 0.3% |
| Value | Count | Frequency (%) |
| 0 | 2089 | |
| 1 | 1224 | |
| 2 | 702 | 14.3% |
| 3 | 369 | 7.5% |
| 4 | 198 | 4.0% |
| Value | Count | Frequency (%) |
| 43 | 1 | < 0.1% |
| 31 | 1 | < 0.1% |
| 19 | 1 | < 0.1% |
| 15 | 6 | |
| 14 | 1 | < 0.1% |
plot_keywords
Categorical
| Distinct | 4756 |
|---|---|
| Distinct (%) | 99.8% |
| Missing | 152 |
| Missing (%) | 3.1% |
| Memory size | 38.5 KiB |
| based on novel | 4 |
|---|---|
| one word title | 3 |
| two word title | 2 |
| after dark horrorfest | 2 |
| color in title | 2 |
| Other values (4751) |
Length
| Max length | 149 |
|---|---|
| Median length | 50 |
| Mean length | 52.44542401 |
| Min length | 2 |
Characters and Unicode
| Total characters | 249850 |
|---|---|
| Distinct characters | 42 |
| Distinct categories | 7 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 4751 |
|---|---|
| Unique (%) | 99.7% |
Sample
| 1st row | avatar|future|marine|native|paraplegic |
|---|---|
| 2nd row | goddess|marriage ceremony|marriage proposal|pirate|singapore |
| 3rd row | bomb|espionage|sequel|spy|terrorist |
| 4th row | deception|imprisonment|lawlessness|police officer|terrorist plot |
| 5th row | alien|american civil war|male nipple|mars|princess |
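Each cell in this column packs several keywords into one pipe-delimited string, which is why almost every value is distinct. A minimal sketch of splitting the cells into individual keywords, using only the five sample rows shown above (`split_keywords` is a hypothetical helper, not part of the report):

```python
from collections import Counter

# Sample rows copied from the table above.
sample_rows = [
    "avatar|future|marine|native|paraplegic",
    "goddess|marriage ceremony|marriage proposal|pirate|singapore",
    "bomb|espionage|sequel|spy|terrorist",
    "deception|imprisonment|lawlessness|police officer|terrorist plot",
    "alien|american civil war|male nipple|mars|princess",
]

def split_keywords(cell):
    """Split one pipe-delimited cell into a list of trimmed keywords."""
    return [kw.strip() for kw in cell.split("|") if kw.strip()]

# Count how often each individual keyword occurs across the sample.
keyword_counts = Counter(kw for row in sample_rows for kw in split_keywords(row))
print(keyword_counts.most_common(3))
```

Counting at the keyword level rather than the cell level would likely give a far lower cardinality than the 4756 distinct values reported for the raw column.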
| Value | Count | Frequency (%) |
| based on novel | 4 | 0.1% |
| one word title | 3 | 0.1% |
| two word title | 2 | < 0.1% |
| after dark horrorfest | 2 | < 0.1% |
| color in title | 2 | < 0.1% |
| dragon|island|training|viking|village | 1 | < 0.1% |
| box office flop|hawaii|naval|oahu hawaii|ship | 1 | < 0.1% |
| island|sailor|storm|stranded|vacation | 1 | < 0.1% |
| 1970s|female rear nudity|formula 1|rivalry|sex with a nurse | 1 | < 0.1% |
| ash|father|mother|pokemon|professor | 1 | < 0.1% |
| Other values (4746) | 4746 | |
| (Missing) | 152 | 3.1% |
| Value | Count | Frequency (%) |
| in | 324 | 1.8% |
| of | 215 | 1.2% |
| on | 208 | 1.2% |
| the | 187 | 1.1% |
| a | 178 | 1.0% |
| to | 174 | 1.0% |
| york | 122 | 0.7% |
| based | 105 | 0.6% |
| female | 104 | 0.6% |
| by | 97 | 0.6% |
| Other values (11479) | 15863 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 24178 | 9.7% |
| a | 19059 | 7.6% |
| \| | 18714 | 7.5% |
| i | 18265 | 7.3% |
| r | 17649 | 7.1% |
| t | 15796 | 6.3% |
| n | 15281 | 6.1% |
| o | 15103 | 6.0% |
| s | 12955 | 5.2% |
| (space) | 12813 | 5.1% |
| Other values (32) | 80037 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 217006 | |
| Math Symbol | 18714 | 7.5% |
| Space Separator | 12813 | 5.1% |
| Decimal Number | 1099 | 0.4% |
| Other Punctuation | 216 | 0.1% |
| Open Punctuation | 1 | < 0.1% |
| Close Punctuation | 1 | < 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| e | 24178 | |
| a | 19059 | 8.8% |
| i | 18265 | 8.4% |
| r | 17649 | 8.1% |
| t | 15796 | 7.3% |
| n | 15281 | 7.0% |
| o | 15103 | 7.0% |
| s | 12955 | 6.0% |
| l | 10874 | 5.0% |
| c | 9222 | 4.2% |
| Other values (16) | 58624 |
| Value | Count | Frequency (%) |
| 1 | 276 | |
| 0 | 264 | |
| 9 | 215 | |
| 2 | 79 | 7.2% |
| 8 | 61 | 5.6% |
| 5 | 47 | 4.3% |
| 7 | 46 | 4.2% |
| 3 | 44 | 4.0% |
| 6 | 38 | 3.5% |
| 4 | 29 | 2.6% |
| Value | Count | Frequency (%) |
| . | 128 | |
| ' | 88 |
| Value | Count | Frequency (%) |
| \| | 18714 |
| Value | Count | Frequency (%) |
| (space) | 12813 |
| Value | Count | Frequency (%) |
| ( | 1 |
| Value | Count | Frequency (%) |
| ) | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 217006 | |
| Common | 32844 | 13.1% |
Most frequent character per script
| Value | Count | Frequency (%) |
| e | 24178 | |
| a | 19059 | 8.8% |
| i | 18265 | 8.4% |
| r | 17649 | 8.1% |
| t | 15796 | 7.3% |
| n | 15281 | 7.0% |
| o | 15103 | 7.0% |
| s | 12955 | 6.0% |
| l | 10874 | 5.0% |
| c | 9222 | 4.2% |
| Other values (16) | 58624 |
| Value | Count | Frequency (%) |
| \| | 18714 | |
| (space) | 12813 | |
| 1 | 276 | 0.8% |
| 0 | 264 | 0.8% |
| 9 | 215 | 0.7% |
| . | 128 | 0.4% |
| ' | 88 | 0.3% |
| 2 | 79 | 0.2% |
| 8 | 61 | 0.2% |
| 5 | 47 | 0.1% |
| Other values (6) | 159 | 0.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 249850 |
Most frequent character per block
| Value | Count | Frequency (%) |
| e | 24178 | 9.7% |
| a | 19059 | 7.6% |
| \| | 18714 | 7.5% |
| i | 18265 | 7.3% |
| r | 17649 | 7.1% |
| t | 15796 | 6.3% |
| n | 15281 | 6.1% |
| o | 15103 | 6.0% |
| s | 12955 | 5.2% |
| (space) | 12813 | 5.1% |
| Other values (32) | 80037 |
movie_imdb_link
Categorical
| Distinct | 4916 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 38.5 KiB |
| http://www.imdb.com/title/tt0385004/?ref_=fn_tt_tt_1 | 1 |
|---|---|
| http://www.imdb.com/title/tt0139239/?ref_=fn_tt_tt_1 | 1 |
| http://www.imdb.com/title/tt0099587/?ref_=fn_tt_tt_1 | 1 |
| http://www.imdb.com/title/tt0091635/?ref_=fn_tt_tt_1 | 1 |
| http://www.imdb.com/title/tt0080749/?ref_=fn_tt_tt_1 | 1 |
| Other values (4911) |
Length
| Max length | 52 |
|---|---|
| Median length | 52 |
| Mean length | 52 |
| Min length | 52 |
Characters and Unicode
| Total characters | 255632 |
|---|---|
| Distinct characters | 31 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 4916 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | http://www.imdb.com/title/tt0499549/?ref_=fn_tt_tt_1 |
|---|---|
| 2nd row | http://www.imdb.com/title/tt0449088/?ref_=fn_tt_tt_1 |
| 3rd row | http://www.imdb.com/title/tt2379713/?ref_=fn_tt_tt_1 |
| 4th row | http://www.imdb.com/title/tt1345836/?ref_=fn_tt_tt_1 |
| 5th row | http://www.imdb.com/title/tt5289954/?ref_=fn_tt_tt_1 |
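Every link has the same 52-character shape, so the only varying part is the `tt…` title identifier. A minimal sketch of extracting it from the sample rows above; the regular expression and the `extract_title_id` helper are assumptions for illustration, based only on the URLs shown here:

```python
import re

# Matches the "tt" + digits segment that follows "/title/" in the sample URLs.
TITLE_ID = re.compile(r"/title/(tt\d+)/")

def extract_title_id(url):
    """Return the title identifier from a link, or None if the pattern is absent."""
    match = TITLE_ID.search(url)
    return match.group(1) if match else None

# Sample rows copied from the table above.
sample_links = [
    "http://www.imdb.com/title/tt0499549/?ref_=fn_tt_tt_1",
    "http://www.imdb.com/title/tt0449088/?ref_=fn_tt_tt_1",
    "http://www.imdb.com/title/tt2379713/?ref_=fn_tt_tt_1",
    "http://www.imdb.com/title/tt1345836/?ref_=fn_tt_tt_1",
    "http://www.imdb.com/title/tt5289954/?ref_=fn_tt_tt_1",
]

title_ids = [extract_title_id(url) for url in sample_links]
print(title_ids)
```

Because the column is unique per row, the extracted identifier could plausibly serve as a compact key in place of the full URL.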
| Value | Count | Frequency (%) |
| http://www.imdb.com/title/tt0385004/?ref_=fn_tt_tt_1 | 1 | < 0.1% |
| http://www.imdb.com/title/tt0139239/?ref_=fn_tt_tt_1 | 1 | < 0.1% |
| http://www.imdb.com/title/tt0099587/?ref_=fn_tt_tt_1 | 1 | < 0.1% |
| http://www.imdb.com/title/tt0091635/?ref_=fn_tt_tt_1 | 1 | < 0.1% |
| http://www.imdb.com/title/tt0080749/?ref_=fn_tt_tt_1 | 1 | < 0.1% |
| http://www.imdb.com/title/tt0380599/?ref_=fn_tt_tt_1 | 1 | < 0.1% |
| http://www.imdb.com/title/tt1091191/?ref_=fn_tt_tt_1 | 1 | < 0.1% |
| http://www.imdb.com/title/tt0923600/?ref_=fn_tt_tt_1 | 1 | < 0.1% |
| http://www.imdb.com/title/tt0861689/?ref_=fn_tt_tt_1 | 1 | < 0.1% |
| http://www.imdb.com/title/tt0360717/?ref_=fn_tt_tt_1 | 1 | < 0.1% |
| Other values (4906) | 4906 |
| Value | Count | Frequency (%) |
| http://www.imdb.com/title/tt0385004/?ref_=fn_tt_tt_1 | 1 | < 0.1% |
| http://www.imdb.com/title/tt0139239/?ref_=fn_tt_tt_1 | 1 | < 0.1% |
| http://www.imdb.com/title/tt0099587/?ref_=fn_tt_tt_1 | 1 | < 0.1% |
| http://www.imdb.com/title/tt0091635/?ref_=fn_tt_tt_1 | 1 | < 0.1% |
| http://www.imdb.com/title/tt0080749/?ref_=fn_tt_tt_1 | 1 | < 0.1% |
| http://www.imdb.com/title/tt0380599/?ref_=fn_tt_tt_1 | 1 | < 0.1% |
| http://www.imdb.com/title/tt1091191/?ref_=fn_tt_tt_1 | 1 | < 0.1% |
| http://www.imdb.com/title/tt0923600/?ref_=fn_tt_tt_1 | 1 | < 0.1% |
| http://www.imdb.com/title/tt0861689/?ref_=fn_tt_tt_1 | 1 | < 0.1% |
| http://www.imdb.com/title/tt0360717/?ref_=fn_tt_tt_1 | 1 | < 0.1% |
| Other values (4906) | 4906 |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 49160 | |
| / | 24580 | 9.6% |
| _ | 19664 | 7.7% |
| w | 14748 | 5.8% |
| . | 9832 | 3.8% |
| i | 9832 | 3.8% |
| m | 9832 | 3.8% |
| e | 9832 | 3.8% |
| f | 9832 | 3.8% |
| 1 | 9667 | 3.8% |
| Other values (21) | 88653 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 147480 | |
| Other Punctuation | 44244 | 17.3% |
| Decimal Number | 39328 | 15.4% |
| Connector Punctuation | 19664 | 7.7% |
| Math Symbol | 4916 | 1.9% |
Most frequent character per category
| Value | Count | Frequency (%) |
| t | 49160 | |
| w | 14748 | 10.0% |
| i | 9832 | 6.7% |
| m | 9832 | 6.7% |
| e | 9832 | 6.7% |
| f | 9832 | 6.7% |
| h | 4916 | 3.3% |
| p | 4916 | 3.3% |
| d | 4916 | 3.3% |
| b | 4916 | 3.3% |
| Other values (5) | 24580 |
| Value | Count | Frequency (%) |
| 1 | 9667 | |
| 0 | 6632 | |
| 2 | 3570 | 9.1% |
| 3 | 3158 | 8.0% |
| 4 | 3093 | 7.9% |
| 8 | 2848 | 7.2% |
| 6 | 2655 | 6.8% |
| 9 | 2652 | 6.7% |
| 7 | 2624 | 6.7% |
| 5 | 2429 | 6.2% |
| Value | Count | Frequency (%) |
| / | 24580 | |
| . | 9832 | 22.2% |
| : | 4916 | 11.1% |
| ? | 4916 | 11.1% |
| Value | Count | Frequency (%) |
| _ | 19664 |
| Value | Count | Frequency (%) |
| = | 4916 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 147480 | |
| Common | 108152 |
Most frequent character per script
| Value | Count | Frequency (%) |
| / | 24580 | |
| _ | 19664 | |
| . | 9832 | 9.1% |
| 1 | 9667 | 8.9% |
| 0 | 6632 | 6.1% |
| : | 4916 | 4.5% |
| ? | 4916 | 4.5% |
| = | 4916 | 4.5% |
| 2 | 3570 | 3.3% |
| 3 | 3158 | 2.9% |
| Other values (6) | 16301 |
| Value | Count | Frequency (%) |
| t | 49160 | |
| w | 14748 | 10.0% |
| i | 9832 | 6.7% |
| m | 9832 | 6.7% |
| e | 9832 | 6.7% |
| f | 9832 | 6.7% |
| h | 4916 | 3.3% |
| p | 4916 | 3.3% |
| d | 4916 | 3.3% |
| b | 4916 | 3.3% |
| Other values (5) | 24580 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 255632 |
Most frequent character per block
| Value | Count | Frequency (%) |
| t | 49160 | |
| / | 24580 | 9.6% |
| _ | 19664 | 7.7% |
| w | 14748 | 5.8% |
| . | 9832 | 3.8% |
| i | 9832 | 3.8% |
| m | 9832 | 3.8% |
| e | 9832 | 3.8% |
| f | 9832 | 3.8% |
| 1 | 9667 | 3.8% |
| Other values (21) | 88653 |
num_user_for_reviews
Real number (ℝ≥0)
| Distinct | 954 |
|---|---|
| Distinct (%) | 19.5% |
| Missing | 21 |
| Missing (%) | 0.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 267.6688458 |
|---|---|
| Minimum | 1 |
| Maximum | 5060 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 38.5 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 9 |
| Q1 | 64 |
| median | 153 |
| Q3 | 320.5 |
| 95-th percentile | 889.3 |
| Maximum | 5060 |
| Range | 5059 |
| Interquartile range (IQR) | 256.5 |
Descriptive statistics
| Standard deviation | 372.9348388 |
|---|---|
| Coefficient of variation (CV) | 1.3932695 |
| Kurtosis | 28.0147829 |
| Mean | 267.6688458 |
| Median Absolute Deviation (MAD) | 111 |
| Skewness | 4.227610312 |
| Sum | 1310239 |
| Variance | 139080.394 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 49 | 1.0% |
| 3 | 33 | 0.7% |
| 26 | 32 | 0.7% |
| 2 | 32 | 0.7% |
| 10 | 29 | 0.6% |
| 6 | 28 | 0.6% |
| 50 | 26 | 0.5% |
| 8 | 25 | 0.5% |
| 32 | 25 | 0.5% |
| 31 | 24 | 0.5% |
| Other values (944) | 4592 |
| Value | Count | Frequency (%) |
| 1 | 49 | |
| 2 | 32 | |
| 3 | 33 | |
| 4 | 23 | |
| 5 | 19 | 0.4% |
| Value | Count | Frequency (%) |
| 5060 | 1 | |
| 4667 | 1 | |
| 4144 | 1 | |
| 3646 | 1 | |
| 3597 | 1 |
language
Categorical
| Distinct | 47 |
|---|---|
| Distinct (%) | 1.0% |
| Missing | 12 |
| Missing (%) | 0.2% |
| Memory size | 38.5 KiB |
| English | 4582 |
|---|---|
| French | 73 |
| Spanish | 40 |
| Hindi | 28 |
| Mandarin | 24 |
| Other values (42) | 157 |
Length
| Max length | 10 |
|---|---|
| Median length | 7 |
| Mean length | 6.980016313 |
| Min length | 4 |
Characters and Unicode
| Total characters | 34230 |
|---|---|
| Distinct characters | 43 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 18 |
|---|---|
| Unique (%) | 0.4% |
Sample
| 1st row | English |
|---|---|
| 2nd row | English |
| 3rd row | English |
| 4th row | English |
| 5th row | English |
| Value | Count | Frequency (%) |
| English | 4582 | |
| French | 73 | 1.5% |
| Spanish | 40 | 0.8% |
| Hindi | 28 | 0.6% |
| Mandarin | 24 | 0.5% |
| German | 19 | 0.4% |
| Japanese | 17 | 0.3% |
| Russian | 11 | 0.2% |
| Cantonese | 11 | 0.2% |
| Italian | 11 | 0.2% |
| Other values (37) | 88 | 1.8% |
| (Missing) | 12 | 0.2% |
| Value | Count | Frequency (%) |
| english | 4582 | |
| french | 73 | 1.5% |
| spanish | 40 | 0.8% |
| hindi | 28 | 0.6% |
| mandarin | 24 | 0.5% |
| german | 19 | 0.4% |
| japanese | 17 | 0.3% |
| italian | 11 | 0.2% |
| russian | 11 | 0.2% |
| cantonese | 11 | 0.2% |
| Other values (37) | 88 | 1.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 4904 | |
| i | 4781 | |
| h | 4722 | |
| s | 4704 | |
| l | 4608 | |
| g | 4600 | |
| E | 4582 | |
| a | 245 | 0.7% |
| e | 214 | 0.6% |
| r | 157 | 0.5% |
| Other values (33) | 713 | 2.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 29326 | |
| Uppercase Letter | 4904 | 14.3% |
Most frequent character per category
| Value | Count | Frequency (%) |
| n | 4904 | |
| i | 4781 | |
| h | 4722 | |
| s | 4704 | |
| l | 4608 | |
| g | 4600 | |
| a | 245 | 0.8% |
| e | 214 | 0.7% |
| r | 157 | 0.5% |
| c | 88 | 0.3% |
| Other values (13) | 303 | 1.0% |
| Value | Count | Frequency (%) |
| E | 4582 | |
| F | 74 | 1.5% |
| S | 47 | 1.0% |
| H | 34 | 0.7% |
| M | 26 | 0.5% |
| G | 20 | 0.4% |
| J | 17 | 0.3% |
| P | 16 | 0.3% |
| C | 15 | 0.3% |
| I | 15 | 0.3% |
| Other values (10) | 58 | 1.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 34230 |
Most frequent character per script
| Value | Count | Frequency (%) |
| n | 4904 | |
| i | 4781 | |
| h | 4722 | |
| s | 4704 | |
| l | 4608 | |
| g | 4600 | |
| E | 4582 | |
| a | 245 | 0.7% |
| e | 214 | 0.6% |
| r | 157 | 0.5% |
| Other values (33) | 713 | 2.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 34230 |
Most frequent character per block
| Value | Count | Frequency (%) |
| n | 4904 | |
| i | 4781 | |
| h | 4722 | |
| s | 4704 | |
| l | 4608 | |
| g | 4600 | |
| E | 4582 | |
| a | 245 | 0.7% |
| e | 214 | 0.6% |
| r | 157 | 0.5% |
| Other values (33) | 713 | 2.1% |
country
Categorical
| Distinct | 65 |
|---|---|
| Distinct (%) | 1.3% |
| Missing | 5 |
| Missing (%) | 0.1% |
| Memory size | 38.5 KiB |
| USA | 3710 |
|---|---|
| UK | 434 |
| France | 154 |
| Canada | 124 |
| Germany | 94 |
| Other values (60) |
Length
| Max length | 20 |
|---|---|
| Median length | 3 |
| Mean length | 3.489513337 |
| Min length | 2 |
Characters and Unicode
| Total characters | 17137 |
|---|---|
| Distinct characters | 47 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 28 |
|---|---|
| Unique (%) | 0.6% |
Sample
| 1st row | USA |
|---|---|
| 2nd row | USA |
| 3rd row | UK |
| 4th row | USA |
| 5th row | USA |
| Value | Count | Frequency (%) |
| USA | 3710 | |
| UK | 434 | 8.8% |
| France | 154 | 3.1% |
| Canada | 124 | 2.5% |
| Germany | 94 | 1.9% |
| Australia | 53 | 1.1% |
| India | 34 | 0.7% |
| Spain | 33 | 0.7% |
| China | 28 | 0.6% |
| Italy | 23 | 0.5% |
| Other values (55) | 224 | 4.6% |
| Value | Count | Frequency (%) |
| usa | 3710 | |
| uk | 434 | 8.7% |
| france | 154 | 3.1% |
| canada | 124 | 2.5% |
| germany | 97 | 2.0% |
| australia | 53 | 1.1% |
| india | 34 | 0.7% |
| spain | 33 | 0.7% |
| china | 28 | 0.6% |
| italy | 23 | 0.5% |
| Other values (63) | 283 | 5.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| U | 4146 | |
| A | 3778 | |
| S | 3776 | |
| a | 1068 | 6.2% |
| n | 626 | 3.7% |
| K | 466 | 2.7% |
| e | 399 | 2.3% |
| r | 398 | 2.3% |
| i | 244 | 1.4% |
| d | 212 | 1.2% |
| Other values (37) | 2024 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 12826 | |
| Lowercase Letter | 4249 | 24.8% |
| Space Separator | 62 | 0.4% |
Most frequent character per category
| Value | Count | Frequency (%) |
| a | 1068 | |
| n | 626 | |
| e | 399 | 9.4% |
| r | 398 | 9.4% |
| i | 244 | 5.7% |
| d | 212 | 5.0% |
| c | 193 | 4.5% |
| l | 147 | 3.5% |
| y | 136 | 3.2% |
| m | 122 | 2.9% |
| Other values (14) | 704 |
| Value | Count | Frequency (%) |
| U | 4146 | |
| A | 3778 | |
| S | 3776 | |
| K | 466 | 3.6% |
| C | 159 | 1.2% |
| F | 155 | 1.2% |
| G | 100 | 0.8% |
| I | 81 | 0.6% |
| N | 27 | 0.2% |
| J | 22 | 0.2% |
| Other values (12) | 116 | 0.9% |
| Value | Count | Frequency (%) |
| (space) | 62 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 17075 | |
| Common | 62 | 0.4% |
Most frequent character per script
| Value | Count | Frequency (%) |
| U | 4146 | |
| A | 3778 | |
| S | 3776 | |
| a | 1068 | 6.3% |
| n | 626 | 3.7% |
| K | 466 | 2.7% |
| e | 399 | 2.3% |
| r | 398 | 2.3% |
| i | 244 | 1.4% |
| d | 212 | 1.2% |
| Other values (36) | 1962 |
| Value | Count | Frequency (%) |
| (space) | 62 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 17137 |
Most frequent character per block
| Value | Count | Frequency (%) |
| U | 4146 | |
| A | 3778 | |
| S | 3776 | |
| a | 1068 | 6.2% |
| n | 626 | 3.7% |
| K | 466 | 2.7% |
| e | 399 | 2.3% |
| r | 398 | 2.3% |
| i | 244 | 1.4% |
| d | 212 | 1.2% |
| Other values (37) | 2024 |
content_rating
Categorical
| Distinct | 18 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 300 |
| Missing (%) | 6.1% |
| Memory size | 38.5 KiB |
| R | 2067 |
|---|---|
| PG-13 | 1411 |
| PG | 686 |
| Not Rated | 115 |
| G | 112 |
| Other values (13) |
Length
| Max length | 9 |
|---|---|
| Median length | 2 |
| Mean length | 2.807192374 |
| Min length | 1 |
Characters and Unicode
| Total characters | 12958 |
|---|---|
| Distinct characters | 28 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 2 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | PG-13 |
|---|---|
| 2nd row | PG-13 |
| 3rd row | PG-13 |
| 4th row | PG-13 |
| 5th row | PG-13 |
| Value | Count | Frequency (%) |
| R | 2067 | |
| PG-13 | 1411 | |
| PG | 686 | 14.0% |
| Not Rated | 115 | 2.3% |
| G | 112 | 2.3% |
| Unrated | 59 | 1.2% |
| Approved | 54 | 1.1% |
| TV-14 | 30 | 0.6% |
| TV-MA | 18 | 0.4% |
| TV-PG | 13 | 0.3% |
| Other values (8) | 51 | 1.0% |
| (Missing) | 300 | 6.1% |
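Some Frequency (%) cells in the table above are blank. Assuming each frequency is simply the count divided by the 4916 observations stated in the dataset statistics, the blank cells can be recomputed as sketched below; the rows that do carry a percentage agree with this assumption.

```python
N_OBSERVATIONS = 4916  # "Number of observations" from the dataset statistics

# Counts copied from the table above.
counts = {
    "R": 2067, "PG-13": 1411, "PG": 686, "Not Rated": 115, "G": 112,
    "Unrated": 59, "Approved": 54, "TV-14": 30, "TV-MA": 18, "TV-PG": 13,
    "Other values (8)": 51, "(Missing)": 300,
}

def frequency_pct(count, total=N_OBSERVATIONS):
    """Share of all observations, as a percentage rounded to one decimal."""
    return round(100 * count / total, 1)

frequencies = {label: frequency_pct(count) for label, count in counts.items()}
for label, pct in frequencies.items():
    print(f"{label:<18} {counts[label]:>5}  {pct}%")
```

The same calculation should apply to the blank frequency cells in the other value tables, provided the denominator matches the table (all observations here, total characters or words in the character and word tables).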
| Value | Count | Frequency (%) |
| r | 2067 | |
| pg-13 | 1411 | |
| pg | 686 | 14.5% |
| rated | 115 | 2.4% |
| not | 115 | 2.4% |
| g | 112 | 2.4% |
| unrated | 59 | 1.2% |
| approved | 54 | 1.1% |
| tv-14 | 30 | 0.6% |
| tv-ma | 18 | 0.4% |
| Other values (9) | 64 | 1.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| G | 2238 | |
| R | 2182 | |
| P | 2125 | |
| - | 1491 | |
| 1 | 1448 | |
| 3 | 1411 | |
| t | 289 | 2.2% |
| e | 237 | 1.8% |
| d | 237 | 1.8% |
| a | 183 | 1.4% |
| Other values (18) | 1117 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 6988 | |
| Decimal Number | 2897 | |
| Dash Punctuation | 1491 | 11.5% |
| Lowercase Letter | 1467 | 11.3% |
| Space Separator | 115 | 0.9% |
Most frequent character per category
| Value | Count | Frequency (%) |
| G | 2238 | |
| R | 2182 | |
| P | 2125 | |
| N | 122 | 1.7% |
| T | 73 | 1.0% |
| V | 73 | 1.0% |
| A | 72 | 1.0% |
| U | 59 | 0.8% |
| M | 23 | 0.3% |
| X | 12 | 0.2% |
| Other values (2) | 9 | 0.1% |
| Value | Count | Frequency (%) |
| t | 289 | |
| e | 237 | |
| d | 237 | |
| a | 183 | |
| o | 169 | |
| r | 113 | 7.7% |
| p | 108 | 7.4% |
| n | 59 | 4.0% |
| v | 54 | 3.7% |
| s | 18 | 1.2% |
| Value | Count | Frequency (%) |
| 1 | 1448 | |
| 3 | 1411 | |
| 4 | 30 | 1.0% |
| 7 | 8 | 0.3% |
| Value | Count | Frequency (%) |
| - | 1491 |
| Value | Count | Frequency (%) |
| (space) | 115 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 8455 | |
| Common | 4503 |
Most frequent character per script
| Value | Count | Frequency (%) |
| G | 2238 | |
| R | 2182 | |
| P | 2125 | |
| t | 289 | 3.4% |
| e | 237 | 2.8% |
| d | 237 | 2.8% |
| a | 183 | 2.2% |
| o | 169 | 2.0% |
| N | 122 | 1.4% |
| r | 113 | 1.3% |
| Other values (12) | 560 | 6.6% |
| Value | Count | Frequency (%) |
| - | 1491 | |
| 1 | 1448 | |
| 3 | 1411 | |
| (space) | 115 | 2.6% |
| 4 | 30 | 0.7% |
| 7 | 8 | 0.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 12958 |
Most frequent character per block
| Value | Count | Frequency (%) |
| G | 2238 | |
| R | 2182 | |
| P | 2125 | |
| - | 1491 | |
| 1 | 1448 | |
| 3 | 1411 | |
| t | 289 | 2.2% |
| e | 237 | 1.8% |
| d | 237 | 1.8% |
| a | 183 | 1.4% |
| Other values (18) | 1117 |
budget
Real number (ℝ≥0)
| Distinct | 438 |
|---|---|
| Distinct (%) | 9.9% |
| Missing | 484 |
| Missing (%) | 9.8% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 36547486.03 |
|---|---|
| Minimum | 218 |
| Maximum | 4200000000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 38.5 KiB |
Quantile statistics
| Minimum | 218 |
|---|---|
| 5-th percentile | 500000 |
| Q1 | 6000000 |
| median | 19850000 |
| Q3 | 43000000 |
| 95-th percentile | 125000000 |
| Maximum | 4200000000 |
| Range | 4199999782 |
| Interquartile range (IQR) | 37000000 |
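The quantiles above span seven orders of magnitude, which fits the "highly skewed" warning raised for this variable. One common option for such data, shown here only as a sketch and not something the report itself applies, is to inspect it on a log10 scale:

```python
import math

# Quantiles copied from the table above.
quantiles = {
    "minimum": 218,
    "5%": 500_000,
    "Q1": 6_000_000,
    "median": 19_850_000,
    "Q3": 43_000_000,
    "95%": 125_000_000,
    "maximum": 4_200_000_000,
}

# The same quantiles on a log10 scale.
log_quantiles = {name: math.log10(value) for name, value in quantiles.items()}

# How far the maximum sits from the median on each scale.
raw_ratio = quantiles["maximum"] / quantiles["median"]
log_gap = log_quantiles["maximum"] - log_quantiles["median"]

print(f"maximum / median = {raw_ratio:.1f}")
print(f"log10 gap        = {log_gap:.2f}")
```

On the raw scale the maximum is a couple of hundred times the median, while on the log scale it is only a little over two units away, so the extreme values no longer dominate. The largest values may also reflect budgets recorded in different currencies, which this sketch does not attempt to correct.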
Descriptive statistics
| Standard deviation | 100242679.2 |
|---|---|
| Coefficient of variation (CV) | 2.742806418 |
| Kurtosis | 870.8894003 |
| Mean | 36547486.03 |
| Median Absolute Deviation (MAD) | 15850000 |
| Skewness | 25.36637236 |
| Sum | 1.619784581 × 10¹¹ |
| Variance | 1.004859474 × 10¹⁶ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 20000000 | 168 | 3.4% |
| 25000000 | 139 | 2.8% |
| 15000000 | 139 | 2.8% |
| 30000000 | 136 | 2.8% |
| 10000000 | 133 | 2.7% |
| 40000000 | 129 | 2.6% |
| 35000000 | 117 | 2.4% |
| 5000000 | 108 | 2.2% |
| 50000000 | 99 | 2.0% |
| 12000000 | 91 | 1.9% |
| Other values (428) | 3173 | |
| (Missing) | 484 | 9.8% |
| Value | Count | Frequency (%) |
| 218 | 1 | |
| 1100 | 1 | |
| 1400 | 1 | |
| 3250 | 1 | |
| 4500 | 1 |
| Value | Count | Frequency (%) |
| 4200000000 | 1 | |
| 2500000000 | 1 | |
| 2400000000 | 1 | |
| 2127519898 | 1 | |
| 1100000000 | 1 |
title_year
Real number (ℝ≥0)
| Distinct | 91 |
|---|---|
| Distinct (%) | 1.9% |
| Missing | 106 |
| Missing (%) | 2.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2002.447609 |
|---|---|
| Minimum | 1916 |
| Maximum | 2016 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 38.5 KiB |
Quantile statistics
| Minimum | 1916 |
|---|---|
| 5-th percentile | 1979 |
| Q1 | 1999 |
| median | 2005 |
| Q3 | 2011 |
| 95-th percentile | 2015 |
| Maximum | 2016 |
| Range | 100 |
| Interquartile range (IQR) | 12 |
Descriptive statistics
| Standard deviation | 12.45397681 |
|---|---|
| Coefficient of variation (CV) | 0.0062193771 |
| Kurtosis | 7.630278079 |
| Mean | 2002.447609 |
| Median Absolute Deviation (MAD) | 6 |
| Skewness | -2.320339567 |
| Sum | 9631773 |
| Variance | 155.1015383 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2009 | 253 | 5.1% |
| 2014 | 243 | 4.9% |
| 2006 | 233 | 4.7% |
| 2013 | 231 | 4.7% |
| 2010 | 225 | 4.6% |
| 2011 | 224 | 4.6% |
| 2008 | 223 | 4.5% |
| 2005 | 216 | 4.4% |
| 2012 | 214 | 4.4% |
| 2015 | 211 | 4.3% |
| Other values (81) | 2537 |
| Value | Count | Frequency (%) |
| 1916 | 1 | |
| 1920 | 1 | |
| 1925 | 1 | |
| 1927 | 1 | |
| 1929 | 2 |
| Value | Count | Frequency (%) |
| 2016 | 98 | |
| 2015 | 211 | |
| 2014 | 243 | |
| 2013 | 231 | |
| 2012 | 214 |
| Distinct | 917 |
|---|---|
| Distinct (%) | 18.7% |
| Missing | 13 |
| Missing (%) | 0.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1621.923516 |
|---|---|
| Minimum | 0 |
| Maximum | 137000 |
| Zeros | 55 |
| Zeros (%) | 1.1% |
| Memory size | 38.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 25 |
| Q1 | 277 |
| median | 593 |
| Q3 | 912 |
| 95-th percentile | 11000 |
| Maximum | 137000 |
| Range | 137000 |
| Interquartile range (IQR) | 635 |
Descriptive statistics
| Standard deviation | 4011.299523 |
|---|---|
| Coefficient of variation (CV) | 2.473174279 |
| Kurtosis | 271.6032173 |
| Mean | 1621.923516 |
| Median Absolute Deviation (MAD) | 318 |
| Skewness | 10.25322055 |
| Sum | 7952291 |
| Variance | 16090523.86 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1000 | 294 | 6.0% |
| 11000 | 106 | 2.2% |
| 2000 | 97 | 2.0% |
| 3000 | 73 | 1.5% |
| 0 | 55 | 1.1% |
| 10000 | 45 | 0.9% |
| 13000 | 39 | 0.8% |
| 14000 | 38 | 0.8% |
| 826 | 35 | 0.7% |
| 4000 | 33 | 0.7% |
| Other values (907) | 4088 | 83.2% |
| Value | Count | Frequency (%) |
| 0 | 55 | 1.1% |
| 2 | 14 | 0.3% |
| 3 | 13 | 0.3% |
| 4 | 11 | 0.2% |
| 5 | 10 | 0.2% |
| Value | Count | Frequency (%) |
| 137000 | 1 | < 0.1% |
| 29000 | 1 | < 0.1% |
| 27000 | 2 | < 0.1% |
| 25000 | 2 | < 0.1% |
| 23000 | 6 | 0.1% |
imdb_score
Real number (ℝ≥0)
| Distinct | 78 |
|---|---|
| Distinct (%) | 1.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.437428804 |
|---|---|
| Minimum | 1.6 |
| Maximum | 9.5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 38.5 KiB |
Quantile statistics
| Minimum | 1.6 |
|---|---|
| 5-th percentile | 4.3 |
| Q1 | 5.8 |
| median | 6.6 |
| Q3 | 7.2 |
| 95-th percentile | 8.1 |
| Maximum | 9.5 |
| Range | 7.9 |
| Interquartile range (IQR) | 1.4 |
Descriptive statistics
| Standard deviation | 1.127802092 |
|---|---|
| Coefficient of variation (CV) | 0.1751944955 |
| Kurtosis | 0.9292585715 |
| Mean | 6.437428804 |
| Median Absolute Deviation (MAD) | 0.7 |
| Skewness | -0.7404080151 |
| Sum | 31646.4 |
| Variance | 1.271937559 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 6.7 | 216 | 4.4% |
| 6.6 | 199 | 4.0% |
| 7.2 | 187 | 3.8% |
| 6.4 | 183 | 3.7% |
| 6.5 | 182 | 3.7% |
| 7.3 | 180 | 3.7% |
| 6.8 | 178 | 3.6% |
| 7.1 | 177 | 3.6% |
| 7 | 177 | 3.6% |
| 6.3 | 175 | 3.6% |
| Other values (68) | 3062 | 62.3% |
| Value | Count | Frequency (%) |
| 1.6 | 1 | < 0.1% |
| 1.7 | 1 | < 0.1% |
| 1.9 | 3 | 0.1% |
| 2 | 2 | < 0.1% |
| 2.1 | 3 | 0.1% |
| Value | Count | Frequency (%) |
| 9.5 | 1 | < 0.1% |
| 9.3 | 1 | < 0.1% |
| 9.2 | 1 | < 0.1% |
| 9.1 | 2 | < 0.1% |
| 9 | 3 | 0.1% |
aspect_ratio
Real number (ℝ≥0)
| Distinct | 22 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 326 |
| Missing (%) | 6.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.222348584 |
|---|---|
| Minimum | 1.18 |
| Maximum | 16 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 38.5 KiB |
Quantile statistics
| Minimum | 1.18 |
|---|---|
| 5-th percentile | 1.66 |
| Q1 | 1.85 |
| median | 2.35 |
| Q3 | 2.35 |
| 95-th percentile | 2.35 |
| Maximum | 16 |
| Range | 14.82 |
| Interquartile range (IQR) | 0.5 |
Descriptive statistics
| Standard deviation | 1.402939811 |
|---|---|
| Coefficient of variation (CV) | 0.6312870183 |
| Kurtosis | 88.33874594 |
| Mean | 2.222348584 |
| Median Absolute Deviation (MAD) | 0.04 |
| Skewness | 9.277083835 |
| Sum | 10200.58 |
| Variance | 1.968240114 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2.35 | 2283 | 46.4% |
| 1.85 | 1866 | 38.0% |
| 1.78 | 108 | 2.2% |
| 1.37 | 99 | 2.0% |
| 1.33 | 66 | 1.3% |
| 1.66 | 63 | 1.3% |
| 16 | 45 | 0.9% |
| 2.39 | 15 | 0.3% |
| 2.2 | 14 | 0.3% |
| 4 | 7 | 0.1% |
| Other values (12) | 24 | 0.5% |
| (Missing) | 326 | 6.6% |
| Value | Count | Frequency (%) |
| 1.18 | 1 | < 0.1% |
| 1.2 | 1 | < 0.1% |
| 1.33 | 66 | 1.3% |
| 1.37 | 99 | 2.0% |
| 1.44 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 16 | 45 | 0.9% |
| 4 | 7 | 0.1% |
| 2.76 | 3 | 0.1% |
| 2.55 | 2 | < 0.1% |
| 2.4 | 3 | 0.1% |
movie_facebook_likes
Real number (ℝ≥0)
| Distinct | 876 |
|---|---|
| Distinct (%) | 17.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7348.294142 |
|---|---|
| Minimum | 0 |
| Maximum | 349000 |
| Zeros | 2130 |
| Zeros (%) | 43.3% |
| Memory size | 38.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 159 |
| Q3 | 2000 |
| 95-th percentile | 40000 |
| Maximum | 349000 |
| Range | 349000 |
| Interquartile range (IQR) | 2000 |
Descriptive statistics
| Standard deviation | 19206.01646 |
|---|---|
| Coefficient of variation (CV) | 2.613670069 |
| Kurtosis | 43.14957809 |
| Mean | 7348.294142 |
| Median Absolute Deviation (MAD) | 159 |
| Skewness | 5.177099068 |
| Sum | 36124214 |
| Variance | 368871068.2 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 2130 | 43.3% |
| 1000 | 108 | 2.2% |
| 11000 | 79 | 1.6% |
| 10000 | 76 | 1.5% |
| 13000 | 58 | 1.2% |
| 12000 | 58 | 1.2% |
| 2000 | 56 | 1.1% |
| 15000 | 48 | 1.0% |
| 14000 | 47 | 1.0% |
| 16000 | 46 | 0.9% |
| Other values (866) | 2210 | 45.0% |
| Value | Count | Frequency (%) |
| 0 | 2130 | 43.3% |
| 2 | 2 | < 0.1% |
| 3 | 1 | < 0.1% |
| 4 | 5 | 0.1% |
| 5 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 349000 | 1 | < 0.1% |
| 199000 | 1 | < 0.1% |
| 197000 | 1 | < 0.1% |
| 191000 | 1 | < 0.1% |
| 190000 | 1 | < 0.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear relationship the steepness of the line does not affect r (only its direction does). To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
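The calculation described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the report's own implementation: the function name `pearson_r` and the toy data are hypothetical, and population (divide-by-n) moments are used, which cancel in the ratio.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)

# Hypothetical toy data, not taken from the report.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

r = pearson_r(x, y)
# Shifting and rescaling one variable (with a positive factor) leaves r unchanged.
r_scaled = pearson_r([10 * v + 3 for v in x], y)
print(round(r, 4), round(r_scaled, 4))
```

The second call illustrates the invariance under changes in location and scale mentioned above.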
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
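A minimal sketch of this rank-based calculation follows, assuming ties receive the average of the ranks they span. The helper names `average_ranks`, `pearson_r`, and `spearman_rho` and the toy data are hypothetical and are not taken from the report.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables of X and Y."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical toy data: y = x**3 is monotonic but not linear.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
r_raw = pearson_r(x, y)
print(rho, r_raw)
```

On this toy data ρ reaches its maximum while Pearson's r on the raw values stays below 1, which is the sense in which ρ catches nonlinear monotonic relationships.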
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
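The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements exactly that definition on hypothetical toy data; the function name `kendall_tau_a` is illustrative, and library implementations (which may apply a tie correction such as tau-b) can give different values when ties are present.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # Positive product: both variables order the pair the same way.
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # Pairs tied in either variable count as neither.
    return (concordant - discordant) / len(pairs)

# Hypothetical toy data: one swapped pair out of six.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(round(tau, 4))
```

Here five of the six pairs are concordant and one is discordant, so τ = (5 - 1) / 6.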
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available for Phik.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
First rows
| | movie_title | color | director_name | num_critic_for_reviews | duration | director_facebook_likes | actor_3_facebook_likes | actor_2_name | actor_1_facebook_likes | gross | genres | actor_1_name | num_voted_users | cast_total_facebook_likes | actor_3_name | facenumber_in_poster | plot_keywords | movie_imdb_link | num_user_for_reviews | language | country | content_rating | budget | title_year | actor_2_facebook_likes | imdb_score | aspect_ratio | movie_facebook_likes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Avatar | Color | James Cameron | 723.0 | 178.0 | 0.0 | 855.0 | Joel David Moore | 1000.0 | 760505847.0 | Action|Adventure|Fantasy|Sci-Fi | CCH Pounder | 886204 | 4834 | Wes Studi | 0.0 | avatar|future|marine|native|paraplegic | http://www.imdb.com/title/tt0499549/?ref_=fn_tt_tt_1 | 3054.0 | English | USA | PG-13 | 237000000.0 | 2009.0 | 936.0 | 7.9 | 1.78 | 33000 |
| 1 | Pirates of the Caribbean: At World's End | Color | Gore Verbinski | 302.0 | 169.0 | 563.0 | 1000.0 | Orlando Bloom | 40000.0 | 309404152.0 | Action|Adventure|Fantasy | Johnny Depp | 471220 | 48350 | Jack Davenport | 0.0 | goddess|marriage ceremony|marriage proposal|pirate|singapore | http://www.imdb.com/title/tt0449088/?ref_=fn_tt_tt_1 | 1238.0 | English | USA | PG-13 | 300000000.0 | 2007.0 | 5000.0 | 7.1 | 2.35 | 0 |
| 2 | Spectre | Color | Sam Mendes | 602.0 | 148.0 | 0.0 | 161.0 | Rory Kinnear | 11000.0 | 200074175.0 | Action|Adventure|Thriller | Christoph Waltz | 275868 | 11700 | Stephanie Sigman | 1.0 | bomb|espionage|sequel|spy|terrorist | http://www.imdb.com/title/tt2379713/?ref_=fn_tt_tt_1 | 994.0 | English | UK | PG-13 | 245000000.0 | 2015.0 | 393.0 | 6.8 | 2.35 | 85000 |
| 3 | The Dark Knight Rises | Color | Christopher Nolan | 813.0 | 164.0 | 22000.0 | 23000.0 | Christian Bale | 27000.0 | 448130642.0 | Action|Thriller | Tom Hardy | 1144337 | 106759 | Joseph Gordon-Levitt | 0.0 | deception|imprisonment|lawlessness|police officer|terrorist plot | http://www.imdb.com/title/tt1345836/?ref_=fn_tt_tt_1 | 2701.0 | English | USA | PG-13 | 250000000.0 | 2012.0 | 23000.0 | 8.5 | 2.35 | 164000 |
| 4 | Star Wars: Episode VII - The Force Awakens | NaN | Doug Walker | NaN | NaN | 131.0 | NaN | Rob Walker | 131.0 | NaN | Documentary | Doug Walker | 8 | 143 | NaN | 0.0 | NaN | http://www.imdb.com/title/tt5289954/?ref_=fn_tt_tt_1 | NaN | NaN | NaN | NaN | NaN | NaN | 12.0 | 7.1 | NaN | 0 |
| 5 | John Carter | Color | Andrew Stanton | 462.0 | 132.0 | 475.0 | 530.0 | Samantha Morton | 640.0 | 73058679.0 | Action|Adventure|Sci-Fi | Daryl Sabara | 212204 | 1873 | Polly Walker | 1.0 | alien|american civil war|male nipple|mars|princess | http://www.imdb.com/title/tt0401729/?ref_=fn_tt_tt_1 | 738.0 | English | USA | PG-13 | 263700000.0 | 2012.0 | 632.0 | 6.6 | 2.35 | 24000 |
| 6 | Spider-Man 3 | Color | Sam Raimi | 392.0 | 156.0 | 0.0 | 4000.0 | James Franco | 24000.0 | 336530303.0 | Action|Adventure|Romance | J.K. Simmons | 383056 | 46055 | Kirsten Dunst | 0.0 | sandman|spider man|symbiote|venom|villain | http://www.imdb.com/title/tt0413300/?ref_=fn_tt_tt_1 | 1902.0 | English | USA | PG-13 | 258000000.0 | 2007.0 | 11000.0 | 6.2 | 2.35 | 0 |
| 7 | Tangled | Color | Nathan Greno | 324.0 | 100.0 | 15.0 | 284.0 | Donna Murphy | 799.0 | 200807262.0 | Adventure|Animation|Comedy|Family|Fantasy|Musical|Romance | Brad Garrett | 294810 | 2036 | M.C. Gainey | 1.0 | 17th century|based on fairy tale|disney|flower|tower | http://www.imdb.com/title/tt0398286/?ref_=fn_tt_tt_1 | 387.0 | English | USA | PG | 260000000.0 | 2010.0 | 553.0 | 7.8 | 1.85 | 29000 |
| 8 | Avengers: Age of Ultron | Color | Joss Whedon | 635.0 | 141.0 | 0.0 | 19000.0 | Robert Downey Jr. | 26000.0 | 458991599.0 | Action|Adventure|Sci-Fi | Chris Hemsworth | 462669 | 92000 | Scarlett Johansson | 4.0 | artificial intelligence|based on comic book|captain america|marvel cinematic universe|superhero | http://www.imdb.com/title/tt2395427/?ref_=fn_tt_tt_1 | 1117.0 | English | USA | PG-13 | 250000000.0 | 2015.0 | 21000.0 | 7.5 | 2.35 | 118000 |
| 9 | Harry Potter and the Half-Blood Prince | Color | David Yates | 375.0 | 153.0 | 282.0 | 10000.0 | Daniel Radcliffe | 25000.0 | 301956980.0 | Adventure|Family|Fantasy|Mystery | Alan Rickman | 321795 | 58753 | Rupert Grint | 3.0 | blood|book|love|potion|professor | http://www.imdb.com/title/tt0417741/?ref_=fn_tt_tt_1 | 973.0 | English | UK | PG | 250000000.0 | 2009.0 | 11000.0 | 7.5 | 2.35 | 10000 |
Last rows
| | movie_title | color | director_name | num_critic_for_reviews | duration | director_facebook_likes | actor_3_facebook_likes | actor_2_name | actor_1_facebook_likes | gross | genres | actor_1_name | num_voted_users | cast_total_facebook_likes | actor_3_name | facenumber_in_poster | plot_keywords | movie_imdb_link | num_user_for_reviews | language | country | content_rating | budget | title_year | actor_2_facebook_likes | imdb_score | aspect_ratio | movie_facebook_likes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4906 | Primer | Color | Shane Carruth | 143.0 | 77.0 | 291.0 | 8.0 | David Sullivan | 291.0 | 424760.0 | Drama|Sci-Fi|Thriller | Shane Carruth | 72639 | 368 | Casey Gooden | 0.0 | changing the future|independent film|invention|nonlinear timeline|time travel | http://www.imdb.com/title/tt0390384/?ref_=fn_tt_tt_1 | 371.0 | English | USA | PG-13 | 7000.0 | 2004.0 | 45.0 | 7.0 | 1.85 | 19000 |
| 4907 | Cavite | Color | Neill Dela Llana | 35.0 | 80.0 | 0.0 | 0.0 | Edgar Tancangco | 0.0 | 70071.0 | Thriller | Ian Gamazon | 589 | 0 | Quynn Ton | 0.0 | jihad|mindanao|philippines|security guard|squatter | http://www.imdb.com/title/tt0428303/?ref_=fn_tt_tt_1 | 35.0 | English | Philippines | Not Rated | 7000.0 | 2005.0 | 0.0 | 6.3 | NaN | 74 |
| 4908 | El Mariachi | Color | Robert Rodriguez | 56.0 | 81.0 | 0.0 | 6.0 | Peter Marquardt | 121.0 | 2040920.0 | Action|Crime|Drama|Romance|Thriller | Carlos Gallardo | 52055 | 147 | Consuelo Gómez | 0.0 | assassin|death|guitar|gun|mariachi | http://www.imdb.com/title/tt0104815/?ref_=fn_tt_tt_1 | 130.0 | Spanish | USA | R | 7000.0 | 1992.0 | 20.0 | 6.9 | 1.37 | 0 |
| 4909 | The Mongol King | Color | Anthony Vallone | NaN | 84.0 | 2.0 | 2.0 | John Considine | 45.0 | NaN | Crime|Drama | Richard Jewell | 36 | 93 | Sara Stepnicka | 0.0 | jewell|mongol|nostradamus|stepnicka|vallone | http://www.imdb.com/title/tt0430371/?ref_=fn_tt_tt_1 | 1.0 | English | USA | PG-13 | 3250.0 | 2005.0 | 44.0 | 7.8 | NaN | 4 |
| 4910 | Newlyweds | Color | Edward Burns | 14.0 | 95.0 | 0.0 | 133.0 | Caitlin FitzGerald | 296.0 | 4584.0 | Comedy|Drama | Kerry Bishé | 1338 | 690 | Daniella Pineda | 1.0 | written and directed by cast member | http://www.imdb.com/title/tt1880418/?ref_=fn_tt_tt_1 | 14.0 | English | USA | Not Rated | 9000.0 | 2011.0 | 205.0 | 6.4 | NaN | 413 |
| 4911 | Signed Sealed Delivered | Color | Scott Smith | 1.0 | 87.0 | 2.0 | 318.0 | Daphne Zuniga | 637.0 | NaN | Comedy|Drama | Eric Mabius | 629 | 2283 | Crystal Lowe | 2.0 | fraud|postal worker|prison|theft|trial | http://www.imdb.com/title/tt3000844/?ref_=fn_tt_tt_1 | 6.0 | English | Canada | NaN | NaN | 2013.0 | 470.0 | 7.7 | NaN | 84 |
| 4912 | The Following | Color | NaN | 43.0 | 43.0 | NaN | 319.0 | Valorie Curry | 841.0 | NaN | Crime|Drama|Mystery|Thriller | Natalie Zea | 73839 | 1753 | Sam Underwood | 1.0 | cult|fbi|hideout|prison escape|serial killer | http://www.imdb.com/title/tt2071645/?ref_=fn_tt_tt_1 | 359.0 | English | USA | TV-14 | NaN | NaN | 593.0 | 7.5 | 16.00 | 32000 |
| 4913 | A Plague So Pleasant | Color | Benjamin Roberds | 13.0 | 76.0 | 0.0 | 0.0 | Maxwell Moody | 0.0 | NaN | Drama|Horror|Thriller | Eva Boehnke | 38 | 0 | David Chandler | 0.0 | NaN | http://www.imdb.com/title/tt2107644/?ref_=fn_tt_tt_1 | 3.0 | English | USA | NaN | 1400.0 | 2013.0 | 0.0 | 6.3 | NaN | 16 |
| 4914 | Shanghai Calling | Color | Daniel Hsia | 14.0 | 100.0 | 0.0 | 489.0 | Daniel Henney | 946.0 | 10443.0 | Comedy|Drama|Romance | Alan Ruck | 1255 | 2386 | Eliza Coupe | 5.0 | NaN | http://www.imdb.com/title/tt2070597/?ref_=fn_tt_tt_1 | 9.0 | English | USA | PG-13 | NaN | 2012.0 | 719.0 | 6.3 | 2.35 | 660 |
| 4915 | My Date with Drew | Color | Jon Gunn | 43.0 | 90.0 | 16.0 | 16.0 | Brian Herzlinger | 86.0 | 85222.0 | Documentary | John August | 4285 | 163 | Jon Gunn | 0.0 | actress name in title|crush|date|four word title|video camera | http://www.imdb.com/title/tt0378407/?ref_=fn_tt_tt_1 | 84.0 | English | USA | PG | 1100.0 | 2004.0 | 23.0 | 6.6 | 1.85 | 456 |